To Paul Schupp, with the greatest affection

Abstract. We start by studying the distribution of (cyclically reduced) elements
of the free groups $F_n$ with respect to their abelianization (or equivalently their
class in $H_1(F_n, \mathbb{Z})$). We derive an explicit generating function, and a limiting
distribution, by means of certain results (of independent interest) on Chebyshev
polynomials; we also prove that the reductions mod $p$ ($p$ an arbitrary prime) of
these classes are asymptotically equidistributed, and we study the deviation from
equidistribution. We extend our techniques to a more general setting and use them
to study the statistical properties of long cycles (and paths) on regular (directed and
undirected) graphs. We return to the free group to study some growth functions
of the number of conjugacy classes as a function of their cyclically reduced length.



Introduction - 2010 

The paper "Growth in free groups (and other stories)" has been around in
preprint form ([45]) since the late nineties (the arXiv version cited dates to 1999,
but this was preceded by a 1997 IHES preprint). Since the paper has had a fair
amount of influence (and parts of it have since become separate papers), it seems a
good idea to publish it at last - this version is not very different from the preprint,
except for this introduction, which gives a bit of background on how and why
it was written together with a survey (necessarily incomplete and subjective) of
what has happened since the arXiv preprint appeared in 1999.

Why? The work described in the paper was initially motivated by the author's
(continuing to this day) interest in counting questions on geodesics on hyperbolic
surfaces, stemming from some conversations with Peter Sarnak in the early
1990s. More precisely, Sarnak had asked about the asymptotics of the number

Date: June 30, 2011. 

1991 Mathematics Subject Classification. Primary 05C25, 05C20, 05C38, 60J10, 60F05, 42A05; Secondary 22E27.

Key words and phrases. graphs, groups, growth function, homology, Chebyshev polynomials,
asymptotics, limiting distributions, perturbation theory, compact groups, geodesic flow, Markov
chains.






IGOR RIVIN



of simple geodesics on the punctured torus, where the only result appeared to be
the one in the paper of Beardon, Lehner, and Sheingorn [3], where the authors
had shown that the number of simple geodesics of length bounded by L grew
somewhere between quadratically and quartically in L. This did not seem to be
very sharp, and indeed, Greg McShane and I improved it to an asymptotic result
(with quadratic growth) in a pair of short papers [39, 38], using purely geometric
methods (showing that the length of the unique shortest geodesic (which can be
shown to be simple) in a primitive integral homology class extends to a norm
on real homology, which is the Gromov, or stable, norm, though at the time
McShane and I had no knowledge of the connection). The fact that there is at most
one simple closed geodesic in a homology class is specific to the punctured torus,
and while other methods can be used to compute the asymptotics of the number
of simple closed geodesics of bounded length on a surface of finite type (the order
of growth was computed by the author in [46], while asymptotics were computed
by Maryam Mirzakhani in [40] - see also [49]), the following question is still wide
open:

How many simple curves of length bounded by L are there in a fixed homology
class h on a hyperbolic surface? Mirzakhani's work implies that a constant
proportion of all simple geodesics are separating, but for a non-trivial homology
class nothing seems to be known to date.

Geodesics in homology classes. Given the interest in geodesics and homology, it
was natural to investigate a similar question for all closed geodesics, not necessarily
simple. It is a well-known result of Huber (for hyperbolic surfaces - Huber uses the
Selberg Trace Formula - [18, 19, 20]) and Margulis ([35, 36], for arbitrary negatively
curved surfaces, using ergodic theory) that the number of closed geodesics of
length bounded by L without homological restrictions is asymptotic to $\exp(hL)/(hL)$,
where h is the topological entropy of the geodesic flow (h = 1 for a hyperbolic
surface). The methods used by Huber and Margulis (Selberg Trace Formula and
ergodic dynamics, respectively) are the two principal tools used in the vast majority
of the papers discussed below (generally either one technique or the other, but not
both, because the Trace Formula gets sharp results but only works in the
constant curvature setting, while dynamical methods are softer, so give weaker
results in a wider setting).

The first result on geodesics in homology classes is due to W. Parry and M.
Pollicott - in their paper [41] they show that when the homology group $H_1(S, \mathbb{Z})$ is
finite, then closed geodesics are equidistributed among homology classes. Parry
and Pollicott use the machinery of thermodynamic formalism and dynamical
zeta functions, and their argument mimics the proof of the Chebotarev density
theorem. Parry and Pollicott's methods work in variable negative curvature, and
they also analyze the lifting of geodesics in a homology class to (finite) Galois
covers. Roughly concurrently, A. Katsuda and T. Sunada showed in [28] that for
homology with coefficients in a finite group, every homology class contains an
infinite number of closed geodesics (but gave no estimate of the growth of their number
as a function of length).

The next result is due to T. Adachi and T. Sunada - in the paper [1] they show
that the exponential growth rate of the number of curves in any homology class is
equal to h (just like for homologically unrestricted geodesics) - they use Markov
partitions as introduced by R. Bowen in [5] and use results on paths in finite
graphs to get the result (which is rather weak, since they don't actually get an
asymptotic result). They point out that getting such a result (via the usual Tauberian
machinery) would require an understanding of the singularity of the L-functions
involved greater than they could produce at the time. They conjecture that the
number of geodesics of length bounded by L in a homology class should grow like
exp{hL)/{L'^'^^), where b is the first Betti number of the manifold. 

This conjecture turns out to be false - in the paper [44], published almost simultaneously
with [1], R. Phillips and P. Sarnak give an asymptotic expansion valid
for a hyperbolic surface: the number of closed geodesics in a fixed homology class,
of length bounded by L grows as

$$(g-1)^g \, \frac{e^L}{L^{g+1}} \left(1 + c_1/L + c_2/L^2 + \cdots\right),$$

where g is the genus of the surface and $c_1, \dots, c_k, \dots$ depend on the homology class. This sort of expansion
appeared (at the time) to be possible only because the manifold had constant negative
curvature. The work of Phillips and Sarnak was extended (again, approximately
at the same time) by C. L. Epstein to cusped surfaces in [8], again using the Selberg
Trace Formula. As often with these kinds of extensions, the result is a lot harder
technically than the Phillips-Sarnak result.

At roughly the same time, A. Katsuda and T. Sunada extended the dynamical
methods of [1] first to surfaces of constant negative curvature in [29] (by observing
that the complicated L-function that could not be dealt with in [1] became much
simpler in constant curvature), and then to general negatively curved surfaces in
[30].

Last, but not least, S. Lalley uses the thermodynamical formalism and some
fairly intricate harmonic analysis in [33] to recover the results of Katsuda-Sunada,
and more: he shows a central limit theorem for the distribution of homology
classes of closed geodesics, and also a "large deviation result". Lalley's result is
closest in spirit to the current paper, but the methods are completely different (and
I had no knowledge of the paper's existence until this writing).






Some motivation. All of the results mentioned in the survey above are technically
quite involved, and it was not clear what was really going on. This is what
gave birth to the current paper. One observation was that it is a lot easier to
work with groups (especially free groups) than with surfaces, and secondly, since
fundamental groups are often quasi-isometric to the spaces they are fundamental
groups of, one has the hope of obtaining "universal" results (that is, a result for a
surface group implies a result (usually somewhat weaker) for every surface of the
appropriate type).

One particular insight (on which much of the paper is based) is the observation 
that for graphs, the Selberg Trace Formula (quite pervasive in the work surveyed 
above) is a triviality: the number of closed (based) cycles of length N in the graph 
is the trace of the N-th power of the adjacency matrix, and thus the sum of the Nth 
powers of eigenvalues of the adjacency matrix of the graph. In the particular case 
where the graph is undirected, the adjacency matrix is symmetric, and analysis 
becomes easy. Technically simpler methods (based in large part on perturbation 
theory for eigenvalues) have helped to get results of much wider scope than 
previously. Let us now review the results and their follow-up in subsequent years. 
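
The trace observation above can be checked directly on a small example. The following is a minimal sketch, not taken from the paper: the choice of the 5-cycle as test graph and all function names are illustrative assumptions. It counts based closed walks by brute force, compares the count with the trace of the N-th power of the adjacency matrix, and with the sum of N-th powers of the eigenvalues $2\cos(2\pi j/n)$ of the n-cycle.

```python
import math
from itertools import product


def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace_of_power(adj, length):
    """Return tr(adj^length), the number of based closed walks of that length."""
    n = len(adj)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(length):
        power = mat_mul(power, adj)
    return sum(power[i][i] for i in range(n))


def count_closed_walks(adj, length):
    """Brute force: count vertex sequences whose cyclically consecutive
    entries are all joined by an edge."""
    n = len(adj)
    return sum(
        all(adj[w[i]][w[(i + 1) % length]] for i in range(length))
        for w in product(range(n), repeat=length)
    )


def cycle_graph(n):
    """Adjacency matrix of the undirected n-cycle (illustrative test graph, n >= 3)."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][(i + 1) % n] = adj[(i + 1) % n][i] = 1
    return adj


def cycle_eigenvalue_power_sum(n, length):
    """Sum of length-th powers of the n-cycle's eigenvalues 2*cos(2*pi*j/n)."""
    return sum((2 * math.cos(2 * math.pi * j / n)) ** length for j in range(n))
```

For instance, `trace_of_power(cycle_graph(5), 5)` counts the walks that go once around the pentagon in either direction from any of the five basepoints.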



Then what happened? Free groups and related subjects. In Section 1 we have
set up the basic model, and used it to count cyclically reduced words in a free
group. The basic method works for any automatic group, and if the structure
is bi-automatic, we similarly get an undirected graph. Somewhat surprisingly,
the count of cyclically reduced words has been used in a number of papers (see,
e.g., [25, 7]), and in the paper [31] by L. M. Koganov it is shown that the formula
is equivalent to H. Whitney's formula for the chromatic polynomial of the cycle
graph. Koganov had apparently published two other papers (in 2002 and 2004)
deriving the enumeration of cyclically reduced words - see references [1] and [2]
in [31].

A related question is considered in Sections 13, 14, and 15, where we study the number
of conjugacy classes of fixed minimal length in the free group (and elsewhere).
We construct an ordinary generating function (in the form of a Lambert series,
see [16] for the definition), which turns out to be horribly irrational (this result has
gone on to have a life of its own in [47]), and the zeta function enumerating primitive
conjugacy classes, which turns out to be an Ihara-type zeta function of the
defining graph (see also the papers of Stark and Terras [59, 60, 61]). The conjecture
that the (standard) generating function is irrational for all non-virtually-cyclic
Gromov-hyperbolic groups is still open. The Ihara zeta function immediately
gives asymptotic growth rates for primitive classes, however this is computed
again by M. Coornaert in [iZJ. 
In Section 2 we write down explicit generating functions for the number of
elements in the free group with a given abelianization. These formulas can be
expressed as Chebyshev polynomials - this is so because the adjacency matrix of
the "recognizing automaton" graph has only two non-trivial eigenvalues, and this
is special to free groups. It would be interesting to write down formulas of this
type for, e.g., surface groups, and see what special functions arise.

The fact that certain variations on Chebyshev polynomials arise as generating
functions gives a previously unknown positivity result on combinations of their
coefficients and shows that the functions $T_n(c \cos x)$ and $U_n(c \cos x)$, where $T$ and
$U$ are Chebyshev polynomials of the first and second kind respectively, and $c > 1$,
are positive semi-definite in the sense of Bochner. This, and the central limit
theorem for the coefficients of "Symmetrized Chebyshev Polynomials", appear in
the author's paper [48].

The Central Limit Theorem for the distribution of elements of the free group $F_n$ is
proved in Section 3, but the methods actually go through without much change to
prove a "Local Limit Theorem". Such a theorem was also shown by R. Sharp, using
much more heavy lifting, in his paper [57]. The central limit theorem was reproved,
together with some variants of results of Phillips-Sarnak, Adachi-Sunada, and
Katsuda-Sunada, in Petridis and Risager's papers [42, 43]. The methods of [42]
involve perturbation theory, and so are similar to those of the current paper.
Results of [42] are closely related to those of [24] - in that paper we show (using
the ergodicity of the $SL(n, \mathbb{Z})$ action on $\mathbb{R}^n$ and the Central Limit Theorem for free
groups in the current paper) that some probabilistic phenomena in the free group
$F_n$ can be studied by descending to the abelian quotient.

The Central Limit Theorem has been extended in other ways as well: D. Calegari
and Koji Fujiwara proved a central limit theorem for the values of bicombable
functions on word-hyperbolic groups in [6], using Markov chain methods, while
M. Horsham and R. Sharp extended the results to quasi-morphisms of free groups
by using the usual symbolic dynamics and thermodynamic formalism in [17].

Lest one think that every function of interest on a free (or word-hyperbolic) group
satisfies a central limit theorem, we should note the results of Guivarc'h-Le Jan
([14, 15]) and Vardi ([62]), which show that the distribution of lengths of
geodesics on the modular surface satisfies a stable law of Cauchy type.



Then what happened? Walks on graphs. In Sections 6 and 6.1 we look at homology
modulo a prime p and derive the expected equidistribution results (and also
the analogue of Chebyshev bias, see [54], which in this case is completely explicit).
More importantly, however, a study of the argument showed that instead of a finite
abelian group we can take any compact (in particular, any finite) group - the harmonic
analysis goes through, although with some more work. The arguments in
this paper are a little sketchy, but are presented in full detail in my papers [50, 51].
These papers, together with [52], are devoted to proving that certain phenomena in
algebraic groups, as well as "geometric" groups, like the mapping class group and
the outer automorphism group of the free group (and a large class of subgroups),
are generic (which means that in large subsets of the groups in question, the vast
majority of elements have a certain property - see [25] for other examples). The
way the results of the current paper are used is essentially through a "Chinese
remaindering" argument - if a certain property does not hold for some fraction of
the elements in the projection of an algebraic group (scheme) over $\mathbb{Z}/p\mathbb{Z}$, then it
does not hold generically in the group over $\mathbb{Z}$. Using property T and a more refined
analysis (as in [51]) gives estimates of convergence speed. The appearance of the
paper [50] is responsible for the subsequent appearance of E. Kowalski's book [32],
where these rather simple ideas are couched in a rather formidable apparatus.

Then what happened? Topological entropy. In the mid-to-late 1990s, the spectacular
results of G. Besson, G. Courtois, and S. Gallot on "volume rigidity" of
locally symmetric spaces (see [4]) were generating a lot of excitement. The result
was that among all the metrics of a given volume on a hyperbolic manifold, the
metric of constant sectional curvature minimizes volume entropy - this answered
a conjecture of Gromov stated in [13], and previously known only in dimension
two (thanks to A. Katok's result [27]). Any time a function has a single minimum,
there is a suspicion that some sort of convexity is afoot, and entropy in the simplest
setting (see, for example, [56]) is a convex function of the probabilities, and
this pushed the author to analyze topological entropy for walks on graphs as a
function of weights on the vertices in Section 11. The methods are again those of
perturbation theory. Later, the result was extended to edge weightings by S. Lim
in [34]. Lim does not prove convexity, but does write down the unique metric
of minimal entropy. A related minimality result is proved by I. Kapovich and T.
Nagnibeda in their paper [22] for regular graphs (their work has its roots in the
study of Outer Space). In a different direction, the convexity of entropy was used
by I. Kapovich and myself in [23] to show that there is no analogue to McShane's
identity in Outer Space.

Introduction 

In this paper we begin by studying certain growth functions of the free group
$F_r$, related to well-studied questions on the growth functions of geodesics on
manifolds. The free group is a relatively simple combinatorial object, and this
allows us to get fairly complete answers to our questions. Our techniques, which
are quite elementary, allow us to get precise results on the distribution of elements
in $F_r$ as a function of their abelianization and in terms of their abelianization mod
p. Our techniques turn out to be easily extensible to the study of paths in graphs
with coefficients in compact groups.

Here is an outline of the paper: In Section 1 we set up an equivalence between
counting cyclically reduced words in the free group $F_r$ and counting circuits on an
associated graph $\mathcal{G}_r$, which, in turn, involves understanding the spectrum of the
adjacency matrix of $\mathcal{G}_r$ (of course the answer is easily obtained, and is well-known;
for convenience we state it as Theorem 1.1). We use this framework to obtain a
generating function for the number of elements of a fixed cyclically reduced length
with prescribed abelianization (or homology class). This turns out to be essentially
a Chebyshev polynomial of the first kind; see Definition 2.2 of the function $R_r$
and Theorem 2.3 (a very brief introduction to Chebyshev polynomials is given
in Section 3). The fact that the function $R_r(c; x)$ (at least for some special values
of the parameter c) is a combinatorial generating function implies a previously
unnoticed positivity result on Chebyshev polynomials; this result is generalized
in Section 4 in Theorems 4.1 and 4.2. Theorem 2.3 is used in Section 5 to derive
a limiting distribution (as n tends to infinity) of cyclically reduced words of length
n among the possible homology classes. From the analytic standpoint this is also
a qualitative result about Chebyshev polynomials, complementing the positivity
Theorems 4.1 and 4.2. In Section 6 we show that if we study homology mod p,
then the cyclically reduced words in $F_r$ are asymptotically equidistributed among
the $p^r$ classes in $H_1(F_r, \mathbb{Z}/p\mathbb{Z})$. We also succeed in estimating the extent to which
the cyclically reduced words in $F_r$ are not equidistributed mod p (Section 6.1).

While the results in Sections 5 and 6 seem to depend on the explicit generating
function that we have obtained, in Section 7 we show that our techniques are
more general, and use them to study the equidistribution properties of long walks
on regular graphs - we obtain a complete answer (Theorem 7.1) - and, without
any change, closed orbits of irreducible primitive Markov processes (with a finite
number of states). The arguments use elementary perturbation theory and the
necessary technical results are contained in Section 10.

In Section HI we extend our methods to study the functions defined on the edges 
of a graph, and as an application we derive the statistical properties of long walks 
without backtracking on the edges of an undirected graph. 

We apply our methods to derive equidistribution results for long walks with 
coefficients in compact groups in Sections 17.11 and HI Our results are completely 
explicit, in that knowing the irreducible representations of the group in question 
allows us to obtain complete asymptotics for the convergence to uniformity. Our 
results also apply, via the construction of a directed edge graph to the statistics of 
"geodesic", that is, backtrackless paths (Section H]). This, in turn, implies a result 
on the statistical properties of "primitive" orbits of Markov processes as above. 
In Section 12 we point out real and philosophical applications of the above-mentioned
result to group theory (where this all started) and geometry.

Finally, in Sections 13 and 14 we derive a relationship between the number of
cyclically reduced words and the number of conjugacy classes of bounded length.
While the generating function of the first is a rational function, the generating
function of the second is the integral of a Lambert series with an infinite number
of poles. These results are then extended to a slightly more general case than
that of free groups. We then (in Section 15) compute a zeta function for primitive
conjugacy classes, and show that this is a rational function.



1. A MODEL AND A GENERATING FUNCTION 

Let G be the free group $F_r = \langle a_1, \dots, a_r \rangle$, and let $g \in G$ be an element. The defining
property of G is that g is uniquely represented by a reduced word in $a_1, \dots, a_r$, that is,
a word where $a_i$ is never adjacent to $a_i^{-1}$. (Notation: in the sequel we shall write $W$
for $w^{-1}$.) We observe that such words over the alphabet $a_1, \dots, a_r, A_1, \dots, A_r$ are, in turn,
generated by walks on the graph $\mathcal{G}_r$ constructed as follows: $\mathcal{G}_r$ has 2r vertices,
labelled with the symbols $a_1, \dots, a_r, A_r, \dots, A_1$ - this peculiar order will simplify
notation later. The vertex corresponding to $a_i$ is connected by an edge to every
vertex except $A_i$. In particular, there is a loop joining $a_i$ to itself (so that $\mathcal{G}_r$ is not
a simple graph). A walk $v_1 v_2 \cdots v_k$ gives the word $v_2 \cdots v_k$, so the correspondence
between walks and words is a $(2r-1)$-to-1 mapping. Note, however, that if we
restrict our attention to closed walks (circuits with basepoint) on $\mathcal{G}_r$, then those
are in bijective correspondence with cyclically reduced words in G. In the sequel we
will be interested exclusively in cyclically reduced words.
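
The correspondence between walks and words can be tried out on small cases. The sketch below is an illustration under stated assumptions rather than part of the paper: generators $a_i$ are encoded as the integers $i$ and their inverses $A_i$ as $-i$, and all function names are hypothetical. It enumerates reduced words by free cancellation and walks by traversing the graph, so the two sides are computed independently.

```python
from itertools import product


def alphabet(r):
    """Letters a_1..a_r encoded as 1..r and their inverses A_1..A_r as -1..-r."""
    return list(range(1, r + 1)) + list(range(-r, 0))


def free_reduce(word):
    """Cancel adjacent inverse pairs until none remain."""
    out = []
    for x in word:
        if out and out[-1] == -x:
            out.pop()
        else:
            out.append(x)
    return tuple(out)


def reduced_words(r, n):
    """All reduced words of length n >= 1 in the free group of rank r."""
    return [w for w in product(alphabet(r), repeat=n) if free_reduce(w) == w]


def cyclically_reduced_words(r, n):
    """Reduced words whose last letter does not cancel against the first."""
    return [w for w in reduced_words(r, n) if w[0] != -w[-1]]


def build_graph(r):
    """Each vertex is joined to every vertex except its inverse (loops included)."""
    verts = alphabet(r)
    return {u: [v for v in verts if v != -u] for u in verts}


def walks(graph, n_vertices):
    """All walks v_1 ... v_n on the graph, as tuples of vertices."""
    result = [(v,) for v in graph]
    for _ in range(n_vertices - 1):
        result = [w + (v,) for w in result for v in graph[w[-1]]]
    return result


def closed_walks(graph, k):
    """Walks v_1 ... v_k whose last vertex is joined back to the first."""
    return [w for w in walks(graph, k) if w[0] in graph[w[-1]]]
```

For r = 2 there are 108 walks on four vertices and 36 reduced words of length three, matching the $(2r-1)$-to-1 count, while the closed walks of length k coincide as a set with the cyclically reduced words of length k.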

1.1. Counting cyclically reduced words. To count cyclically reduced words, then,
we need to count circuits in $\mathcal{G}_r$. This is a well-understood problem: if $\mathcal{A}_r$ is the
adjacency matrix of $\mathcal{G}_r$, then the number of circuits of length k is equal to the trace
of $\mathcal{A}_r^k$. To compute this trace we must compute the spectrum of $\mathcal{A}_r$, and to do this,
it is better to write $\mathcal{A}_r = J_{2r} - P_r$, where $J_N$ is an $N \times N$ matrix all of whose elements
are 1 and $P_r$ is the $2r \times 2r$ matrix such that

$$(P_r)_{ij} = \begin{cases} 1, & \text{if } i + j = 2r + 1; \\ 0, & \text{otherwise.} \end{cases}$$

In order to compute the spectrum of $\mathcal{A}_r$ we note first that the matrix $J_{2r}$ has rank
1. The kernel of $J_{2r}$ is

$$\ker J_{2r} = \Bigl\{ (v_1, \dots, v_{2r}) \;\Big|\; \sum_{j=1}^{2r} v_j = 0 \Bigr\},$$



GROWTH IN FREE GROUPS (AND OTHER STORIES)-TWELVE YEARS LATER 






while the vector 1 = (1, . . . , 1) is the eigenvector of eigenvalue 2r. 

The spectrum of $P_r$ is not much more difficult to compute: the vector 1 is an
eigenvector of $P_r$ as well as of $J_{2r}$, this time with eigenvalue 1. To compute the rest
of the spectral decomposition, let x be an eigenvector of $P_r$ orthogonal to 1, and let
$\lambda$ be the corresponding eigenvalue. Then we have the following set of equations:

$$\sum_{j=1}^{2r} x_j = 0,$$

$$x_j = \lambda x_{2r-j+1}, \qquad j = 1, \dots, 2r.$$

Since at least one of the $x_j$ is not equal to zero, we see that $\lambda^2 = 1$, so $\lambda = \pm 1$. The
orthogonality condition (the first equation above) can be rewritten as $(1 + \lambda) \sum_{j=1}^{r} x_j = 0$. Suppose
$\lambda = -1$. Then the orthogonality condition holds a fortiori, and so the eigenspace of $-1$ is
r-dimensional. On the other hand, if $\lambda = 1$, then we have the additional constraint
that $\sum_{j=1}^{r} x_j = 0$, so the eigenspace of 1 is $(r-1)$-dimensional. Since $\mathcal{A}_r = J_{2r} - P_r$
acts as $-P_r$ on vectors orthogonal to 1, these eigenvalues change sign in passing to $\mathcal{A}_r$. Putting this all together,
we see that the spectrum of the adjacency matrix is

$$(2r - 1, \underbrace{1, \dots, 1}_{r}, \underbrace{-1, \dots, -1}_{r-1}).$$

We see therefore:

Theorem 1.1. The number of cyclically reduced words of length m in $F_r$ is equal to
$(2r-1)^m + 1 + (r-1)[1 + (-1)^m]$.
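
As a sanity check on Theorem 1.1 and on the spectrum just computed, here is a small sketch in exact integer arithmetic. It is an illustration only; the function names are assumptions, and the matrix is 0-indexed, so the condition $i + j = 2r + 1$ becomes `i + j == 2r - 1`. It compares the trace of the m-th power of $J_{2r} - P_r$ with the closed formula, and checks that the matrix is annihilated by the polynomial whose roots are the claimed eigenvalues $2r-1$, $1$, $-1$.

```python
def adjacency_matrix(r):
    """J_{2r} - P_r for the vertex order a_1..a_r, A_r..A_1 (0-indexed)."""
    n = 2 * r
    return [[0 if i + j == n - 1 else 1 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shifted(a, s):
    """Return a - s * I."""
    n = len(a)
    return [[a[i][j] - (s if i == j else 0) for j in range(n)] for i in range(n)]


def closed_walk_count(r, m):
    """tr(A^m) for A = adjacency_matrix(r), with m >= 1."""
    a = adjacency_matrix(r)
    power = a
    for _ in range(m - 1):
        power = mat_mul(power, a)
    return sum(power[i][i] for i in range(2 * r))


def theorem_1_1(r, m):
    """Closed formula for the number of cyclically reduced words of length m."""
    return (2 * r - 1) ** m + 1 + (r - 1) * (1 + (-1) ** m)


def annihilated_by_claimed_spectrum(r):
    """Check (A - (2r-1)I)(A - I)(A + I) = 0, so every eigenvalue is 2r-1 or +-1."""
    a = adjacency_matrix(r)
    prod = mat_mul(mat_mul(shifted(a, 2 * r - 1), shifted(a, 1)), shifted(a, -1))
    return all(entry == 0 for row in prod for entry in row)
```

For r = 2 and m = 4 both sides give 84, in agreement with $3^4 + 1 + 2$.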

2. Counting cyclically reduced words in homology classes 

Recall that the abelianization of $F_r$ is $\mathbb{Z}^r$, generated by the classes $[a_1], \dots, [a_r]$
of $a_1, \dots, a_r$ respectively. To compute the homology class of a word w in $F_r$ we
simply count the total exponents $e_1(w), \dots, e_r(w)$ of the generators used to write w.
Then, $[w] = e_1(w)[a_1] + \cdots + e_r(w)[a_r]$. In this section we will compute the following
generating function:

$$\sum_{w} x_1^{e_1(w)} \cdots x_r^{e_r(w)},$$

where the sum is taken over the set of all cyclically reduced words w in
$a_1, \dots, a_r, A_1, \dots, A_r$ of length k.

To compute this generating function, we return to circuits in $\mathcal{G}_r$. Given a circuit
$c = v_1, \dots, v_k$, $v_{k+1} = v_1$, the contribution of c to the generating function is the
monomial $m_c$ given by the following iterative
procedure: we start with 1, every time we see the vertex $a_i$ we multiply $m_c$ by $x_i$,
and every time we see $A_i$ we multiply $m_c$ by $1/x_i$. From this, it follows that:

Theorem 2.1. This Laurent polynomial is given by $\operatorname{tr} B_r^k$, where $B_r = D_r \mathcal{A}_r$, where,
in turn,

$$D_r = \operatorname{diag}(x_1, \dots, x_r, 1/x_r, \dots, 1/x_1).$$
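
Theorem 2.1 can be tested on small cases by carrying out the matrix multiplication symbolically. The following sketch is illustrative and makes its own representation choices, none of which come from the paper: a Laurent polynomial is a dictionary from exponent tuples $(e_1, \dots, e_r)$ to integer coefficients, and the names `b_matrix`, `trace_b_power` and `brute_force` are hypothetical. The trace of the k-th power is compared with a direct enumeration of cyclically reduced words sorted by homology class.

```python
from itertools import product


def poly_mul(p, q):
    """Multiply Laurent polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return out


def b_matrix(r):
    """B = D A for the vertex order a_1..a_r, A_r..A_1; entries are monomials."""
    n = 2 * r

    def unit(i, sign):
        e = [0] * r
        e[i] = sign
        return tuple(e)

    diag = [unit(i, 1) for i in range(r)] + [unit(i, -1) for i in reversed(range(r))]
    return [[{diag[i]: 1} if i + j != n - 1 else {} for j in range(n)]
            for i in range(n)]


def mat_mul(a, b):
    """Multiply square matrices whose entries are Laurent polynomials."""
    n = len(a)
    result = [[{} for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for e, c in poly_mul(a[i][k], b[k][j]).items():
                    result[i][j][e] = result[i][j].get(e, 0) + c
    return result


def trace_b_power(r, k):
    """tr(B^k) as a Laurent polynomial, for k >= 1."""
    b = b_matrix(r)
    power = b
    for _ in range(k - 1):
        power = mat_mul(power, b)
    total = {}
    for i in range(2 * r):
        for e, c in power[i][i].items():
            total[e] = total.get(e, 0) + c
    return total


def brute_force(r, k):
    """Count cyclically reduced words of length k by exponent vector."""
    letters = list(range(1, r + 1)) + list(range(-r, 0))
    counts = {}
    for w in product(letters, repeat=k):
        if all(w[i] != -w[(i + 1) % k] for i in range(k)):
            e = [0] * r
            for x in w:
                e[abs(x) - 1] += 1 if x > 0 else -1
            counts[tuple(e)] = counts.get(tuple(e), 0) + 1
    return counts
```

For r = 2 and k = 1 both functions return the four monomials $x_1, x_2, 1/x_2, 1/x_1$, each with coefficient 1.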



Computing the trace of $B_r^k$ seems daunting at first, but one can use the approach
we have used to prove Theorem 1.1.
First, note that

$$B_r = D_r \mathcal{A}_r = D_r J_{2r} - D_r P_r.$$

Evidently, the rank of $D_r J_{2r}$ is still equal to 1, and

$$\ker D_r J_{2r} = \Bigl\{ v = (v_1, \dots, v_{2r}) \;\Big|\; \sum_{j=1}^{2r} v_j = 0 \Bigr\}.$$

Note further that an eigenvector v of $D_r P_r$, such that $v \in \ker D_r J_{2r}$, with associated
eigenvalue $\lambda$, is also an eigenvector of $B_r$, with associated eigenvalue $-\lambda$. To find
such an eigenvector, we must solve the system of equations:

$$\sum_{j=1}^{2r} v_j = 0,$$

$$\lambda v_j = v_{2r-j+1}/x_j, \qquad j \le r,$$

$$\lambda v_j = v_{2r-j+1}\, x_{2r-j+1}, \qquad j > r.$$

We find, as before, that $\lambda = \pm 1$. The first equation reduces (almost as before) to

$$\sum_{j=1}^{r} v_j (1 + \lambda x_j) = 0,$$

so that the eigenspaces of both 1 and -1 are $(r-1)$-dimensional. What are the two
remaining eigenvalues $\mu_1$ and $\mu_2$ of $B_r$? Note that since $\det D_r = 1$, we know that
$\det B_r = \det \mathcal{A}_r$. Note now that $\det B_r = \mu_1 \mu_2 (-1)^{r-1}$ while $\det \mathcal{A}_r = (2r-1)(-1)^{r-1}$.
So

$$\mu_1 \mu_2 = 2r - 1. \tag{2.1}$$

On the other hand,

$$\mu_1 + \mu_2 = \operatorname{tr} B_r = \sum_{j=1}^{r} \Bigl( x_j + \frac{1}{x_j} \Bigr). \tag{2.2}$$

Denoting $y_r = \frac{1}{2} \sum_{j=1}^{r} (x_j + 1/x_j)$, we see that $\mu_1, \mu_2$ are the two roots of the equation
$z^2 - 2 y_r z + (2r - 1) = 0$, so that:

$$\mu_1 = y_r - \sqrt{y_r^2 - (2r-1)},$$

$$\mu_2 = y_r + \sqrt{y_r^2 - (2r-1)}.$$

The trace of $B_r^k$ is then equal to $\mu_1^k + \mu_2^k + (r-1)[1 + (-1)^k]$. This can be expressed in
terms of well known special functions, if we make the substitution $y_r = \sqrt{2r-1}\, y_r'$.
Then,

$$\mu_1^k = (2r-1)^{k/2} \Bigl( y_r' - \sqrt{y_r'^2 - 1} \Bigr)^k,$$

$$\mu_2^k = (2r-1)^{k/2} \Bigl( y_r' + \sqrt{y_r'^2 - 1} \Bigr)^k,$$

and so

$$\mu_1^k + \mu_2^k = (2r-1)^{k/2} \Bigl[ \Bigl( y_r' - \sqrt{y_r'^2 - 1} \Bigr)^k + \Bigl( y_r' + \sqrt{y_r'^2 - 1} \Bigr)^k \Bigr] = 2 \bigl(\sqrt{2r-1}\bigr)^k\, T_k(y_r'),$$

where $T_k(x)$ is the k-th Chebyshev polynomial of the first kind. To simplify notation
in the sequel, we define:

Definition 2.2. 

R„{c; xi, . . . , X,) = Tn E + i)) 
S„(c; Xi, . . . , x^) = Un XLi (^( + ^)) • 
And to summarize: 

Theorem 2.3. The number of cyclically reduced words of length k in $F_r$ homologous to
$e_1[a_1] + \cdots + e_r[a_r]$ is equal to the coefficient of $x_1^{e_1} \cdots x_r^{e_r}$ in

(2.3) 2 ( V2r-lf Ru{ / ; Xi, . . . , x,) + (r - 1)[1 + (-l)'^] 

Remark 2.4. The rescaled Chebyshev polynomial $T_k(ax)/a^k$ is called the k-th Dickson
polynomial $T_k(x, a)$ (see [55]).
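
The closed form obtained in this section, $\operatorname{tr} B_r^k = 2(\sqrt{2r-1})^k\, T_k\bigl(y_r/\sqrt{2r-1}\bigr) + (r-1)[1 + (-1)^k]$ with $y_r = \frac12 \sum_j (x_j + 1/x_j)$, can be checked numerically. The sketch below is an illustration under assumptions: the sample points are arbitrary positive reals, the function names are hypothetical, and floating-point arithmetic is used, so agreement is only up to rounding.

```python
import math


def chebyshev_t(n, x):
    """Evaluate T_n(x) with the three-term recurrence T_{n+1} = 2x T_n - T_{n-1}."""
    if n == 0:
        return 1.0
    t_prev, t_cur = 1.0, x
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2 * x * t_cur - t_prev
    return t_cur


def trace_b_power_numeric(xs, k):
    """tr((D A)^k) at the point xs, for the vertex order a_1..a_r, A_r..A_1."""
    r = len(xs)
    n = 2 * r
    diag = list(xs) + [1.0 / x for x in reversed(xs)]
    b = [[diag[i] if i + j != n - 1 else 0.0 for j in range(n)] for i in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        power = [[sum(power[i][m] * b[m][j] for m in range(n)) for j in range(n)]
                 for i in range(n)]
    return sum(power[i][i] for i in range(n))


def closed_form(xs, k):
    """2 (sqrt(2r-1))^k T_k(y_r / sqrt(2r-1)) + (r-1)[1 + (-1)^k]."""
    r = len(xs)
    y = 0.5 * sum(x + 1.0 / x for x in xs)
    s = math.sqrt(2 * r - 1)
    return 2 * s ** k * chebyshev_t(k, y / s) + (r - 1) * (1 + (-1) ** k)
```

At the point $x_1 = x_2 = 1$ every monomial evaluates to 1, so `closed_form((1.0, 1.0), 4)` recovers (up to rounding) the total count 84 of Theorem 1.1.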
3. Some facts about Chebyshev polynomials

The literature on Chebyshev polynomials is enormous; [53] is a good place to start.
Here, we shall supply the barest essentials in an effort to keep this paper self-contained.

There are a number of ways to define Chebyshev polynomials (almost as many as
there are of spelling their inventor's name). A standard definition of the Chebyshev
polynomial of the first kind $T_n(x)$ is:

$$T_n(x) = \cos(n \arccos x). \tag{3.1}$$

In particular, $T_0(x) = 1$, $T_1(x) = x$. Using the identity

$$\cos(x + y) + \cos(x - y) = 2 \cos x \cos y \tag{3.2}$$

we immediately find the three-term recurrence for Chebyshev polynomials:

$$T_{n+1}(x) = 2x\, T_n(x) - T_{n-1}(x). \tag{3.3}$$

The definition of Eq. (3.1) can be used to give a "closed form" used in Section 2:

$$T_n(x) = \frac{1}{2} \Bigl[ \bigl(x - \sqrt{x^2 - 1}\bigr)^n + \bigl(x + \sqrt{x^2 - 1}\bigr)^n \Bigr]. \tag{3.4}$$

Indeed, let $x = \cos\theta$. Then $\bigl(x - \sqrt{x^2 - 1}\bigr)^n = \exp(-in\theta)$, while
$\bigl(x + \sqrt{x^2 - 1}\bigr)^n = \exp(in\theta)$, so
$\frac{1}{2}\bigl[\bigl(x - \sqrt{x^2 - 1}\bigr)^n + \bigl(x + \sqrt{x^2 - 1}\bigr)^n\bigr] = \Re \exp(in\theta) = \cos n\theta$.

Though we will not have too many occasions to use them, we also define
Chebyshev polynomials of the second kind $U_n(x)$, which can again be defined in a
number of ways, one of which is:

$$U_n(x) = \frac{1}{n+1}\, T_{n+1}'(x). \tag{3.5}$$

A simple manipulation shows that if we set $x = \cos\theta$, as before, then

$$U_n(x) = \frac{\sin (n+1)\theta}{\sin \theta}. \tag{3.6}$$

In some ways, Schur's notation $\mathcal{U}_n = U_{n-1}$ is preferable. In any case, we have
$U_0(x) = 1$, $U_1(x) = 2x$, and otherwise the $U_n$ satisfy the same recurrence as the $T_n$,
to wit,

$$U_{n+1}(x) = 2x\, U_n(x) - U_{n-1}(x). \tag{3.7}$$

From the recurrences, it is clear that for $f = T, U$, $f_n(-x) = (-1)^n f_n(x)$, or, in other
words, every second coefficient of $T_n(x)$ and $U_n(x)$ vanishes. The remaining coefficients
alternate in sign; here is the explicit formula for the coefficient $c^{(n)}_{n-2m}$ of $x^{n-2m}$
of $T_n(x)$:

$$c^{(n)}_{n-2m} = (-1)^m\, \frac{n}{n-m} \binom{n-m}{m}\, 2^{n-2m-1}, \qquad m = 0, 1, \dots \tag{3.8}$$

This can be proved easily using Eq. (3.3).
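
The facts collected in this section are easy to cross-check mechanically. The sketch below is an illustration with hypothetical function names: it builds the coefficients of $T_n$ from the recurrence (3.3) in exact arithmetic and compares them with the explicit formula (3.8), the trigonometric definition (3.1), the closed form (3.4), and the derivative definition (3.5) of $U_n$.

```python
import math
from fractions import Fraction


def chebyshev_t_coeffs(n):
    """Integer coefficients [c_0, ..., c_n] of T_n, built from recurrence (3.3)."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in cur]      # 2x * T_n
        for j, c in enumerate(prev):          # minus T_{n-1}
            nxt[j] -= c
        prev, cur = cur, nxt
    return cur


def explicit_coeff(n, m):
    """Coefficient of x^(n-2m) in T_n according to formula (3.8), for n >= 1."""
    return (Fraction((-1) ** m * n * math.comb(n - m, m), n - m)
            * Fraction(2) ** (n - 2 * m - 1))


def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficient list at x."""
    return sum(c * x ** j for j, c in enumerate(coeffs))


def closed_form(n, x):
    """Right-hand side of (3.4) for real x >= 1."""
    root = math.sqrt(x * x - 1)
    return 0.5 * ((x - root) ** n + (x + root) ** n)


def chebyshev_u_coeffs(n):
    """Coefficients of U_n = T'_{n+1} / (n + 1), as in (3.5)."""
    t = chebyshev_t_coeffs(n + 1)
    return [j * t[j] // (n + 1) for j in range(1, n + 2)]
```

For example, `chebyshev_t_coeffs(4)` returns `[1, 0, -8, 0, 8]`, that is, $T_4(x) = 8x^4 - 8x^2 + 1$, with the vanishing and sign alternation described above.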

4. Analysis of the functions $R_n$ and $S_n$.

In view of the alternation of the coefficients, the appearance of the Chebyshev
polynomials as generating functions in Section 2 seems a bit surprising, since
combinatorial generating functions have non-negative coefficients. Below we
state and prove a generalization. Remarkably, Theorems 4.1 and 4.2 do not seem
to have been previously noted.

Theorem 4.1. Let $c > 1$. Then all the coefficients of $R_n(c; x)$ are non-negative. Indeed the
coefficients of $x^n, x^{n-2}, \dots, x^{-n+2}, x^{-n}$ are positive, while the other coefficients are zero. The
same is true of $S_n$ in place of $R_n$.

Proof. Let $a_k^n$ be the coefficient of $x^k$ in $U_n((c/2)(x + 1/x))$. The recurrence (3.7) gives the
following recurrence for the $a_k^n$:

$$a_k^{n+1} = c\,\bigl(a_{k-1}^{n} + a_{k+1}^{n}\bigr) - a_k^{n-1}. \tag{4.1}$$

Now we shall show that the following always holds:

(a): $a_k^n \ge 0$ (inequality being strict if and only if $n - k$ is even).

(b): $a_k^n \ge \max(a_{k-1}^{n-1}, a_{k+1}^{n-1})$, the inequality strict, again, if and only if $n - k$ is even.

(c): $a_k^n \ge a_k^{n-2}$ (strictness as above).

The proof proceeds routinely by induction; first the induction step (we assume
throughout that $n - k$ is even; all the quantities involved are obviously zero otherwise).
By induction $a_k^{n-1} \le \min(a_{k-1}^{n}, a_{k+1}^{n})$, so by the recurrence (4.1), since $c > 1$, it follows that
$a_k^{n+1} > \max(a_{k-1}^{n}, a_{k+1}^{n})$. (a) and (c) follow immediately.

For the base case, we note that $a_0^0 = 1$, while $a_1^1 = a_{-1}^1 = c > 1$, and so the result
for $U_n$ follows. Notice that the above proof does not work for $T_n$, since the base
case fails. Indeed, if $b_k^n$ is the coefficient of $x^k$ in $T_n((c/2)(x + 1/x))$, then $b_0^0 = 1$, while
$b_1^1 = c/2$, not necessarily bigger than one. However, we can use the result for $U_n$,
together with the observation (which follows easily from the addition formula for
sin) that

$$T_n(x) = \frac{1}{2}\bigl(U_n(x) - U_{n-2}(x)\bigr). \tag{4.2}$$

Eq. (4.2) implies that $b_k^n = \frac{1}{2}\bigl(a_k^n - a_k^{n-2}\bigr) \ge 0$, by (c) above. □
The proof above goes through almost verbatim to show:

Theorem 4.2. Let $c > 1$. Then all the coefficients of $R_n$ are non-negative. The same is true
of $S_n$ in place of $R_n$.

To complete the picture, we note that:

Theorem 4.3.

$$R_n(1; x) = \frac{1}{2}\Bigl(x^n + \frac{1}{x^n}\Bigr).$$

Proof. Let $x = \exp i\theta$. Then $\frac{1}{2}(x + 1/x) = \cos\theta$, and $R_n(1; x) = T_n\bigl(\frac{1}{2}(x + 1/x)\bigr) =
\cos n\theta = \frac{1}{2}(x^n + 1/x^n)$. □

Remark 4.4. For $c < -1$ it is true that all the coefficients of $R_n(c; \cdot)$ and $S_n(c; \cdot)$ have
the same sign, but the sign is $(-1)^n$. For $|c| < 1$, the result is completely false. For c
imaginary, the result is true. I am not sure what happens for general complex c.
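
The one-variable positivity statement, the boundary case $c = 1$, and the failure for $|c| < 1$ can all be observed experimentally. The sketch below is illustrative only: the function names and the sample values of c are assumptions, and exact rational arithmetic is used so that signs are not affected by rounding. It expands $T_n$ and $U_n$ at $(c/2)(x + 1/x)$ using the three-term recurrence, in the spirit of (4.1).

```python
from fractions import Fraction


def laurent_coeffs(kind, n, c):
    """Coefficients {k: value} of x^k in T_n or U_n evaluated at (c/2)(x + 1/x).

    Uses p_{n+1} = c (x + 1/x) p_n - p_{n-1}; kind is "T" (first kind)
    or "U" (second kind).
    """
    c = Fraction(c)
    prev = {0: Fraction(1)}
    if n == 0:
        return prev
    first = c / 2 if kind == "T" else c
    cur = {1: first, -1: first}
    for _ in range(n - 1):
        nxt = {}
        for k, v in cur.items():
            nxt[k + 1] = nxt.get(k + 1, 0) + c * v
            nxt[k - 1] = nxt.get(k - 1, 0) + c * v
        for k, v in prev.items():
            nxt[k] = nxt.get(k, 0) - v
        prev, cur = cur, nxt
    return cur


def all_positive(kind, n, c):
    """True if the coefficients of x^n, x^(n-2), ..., x^(-n) are all positive."""
    coeffs = laurent_coeffs(kind, n, c)
    return all(coeffs.get(k, 0) > 0 for k in range(-n, n + 1, 2))
```

With $c = 1/2$ the constant coefficient of $T_2$ is $c^2 - 1 = -3/4$, a concrete instance of the failure noted in Remark 4.4, while for $c = 1$ only the two extreme coefficients survive, as in Theorem 4.3.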

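By the formula (3.8), we can write

$R_n(c; x) = \frac{n}{2}\sum_{m=0}^{\lfloor n/2\rfloor}\frac{(-1)^m}{n-m}\binom{n-m}{m}\,c^{n-2m}\,(x + 1/x)^{n-2m}.$

Noting that

$(x + 1/x)^{n-2m} = \sum_{i=0}^{n-2m}\binom{n-2m}{i}\,x^{n-2m-2i},$

we obtain the expansion

(4.5) $R_n(c; x) = \frac{n}{2}\sum_{k=-n}^{n} x^k\sum_{m=0}^{\lfloor n/2\rfloor}\frac{(-1)^m c^{n-2m}}{n-m}\binom{n-m}{m}\binom{n-2m}{(n-2m-k)/2},$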

where it is understood that $\binom{a}{b}$ is 0 if $b < 0$, or $b > a$, or $b \notin \mathbb{Z}$. We shall denote the
coefficient of $x^k$ by $t(n, k, c)$.
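
As a cross-check, the closed form for $t(n, k, c)$ can be compared with the coefficients produced by the Chebyshev recurrence. The sketch below assumes the reading $t(n,k,c) = \frac{n}{2}\sum_m \frac{(-1)^m c^{n-2m}}{n-m}\binom{n-m}{m}\binom{n-2m}{(n-2m-k)/2}$ of formula (4.5), which is the standard explicit expression for $T_n$ after the substitution $2y = c(x + 1/x)$; the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb


def binom(a, b):
    """Binomial coefficient that vanishes for non-integer b, b < 0 or b > a."""
    b = Fraction(b)
    if b.denominator != 1 or b < 0 or b > a:
        return 0
    return comb(a, int(b))


def t_explicit(n, k, c):
    """Coefficient of x^k in T_n((c/2)(x + 1/x)) from the closed form (n >= 1)."""
    c = Fraction(c)
    total = Fraction(0)
    for m in range(n // 2 + 1):
        total += (Fraction((-1) ** m, n - m) * c ** (n - 2 * m)
                  * comb(n - m, m) * binom(n - 2 * m, Fraction(n - 2 * m - k, 2)))
    return Fraction(n, 2) * total


def t_recurrence(n, c):
    """Same coefficients from T_{n+1} = 2y T_n - T_{n-1}, with 2y = c(x + 1/x) (n >= 1)."""
    c = Fraction(c)
    prev, cur = {0: Fraction(1)}, {1: c / 2, -1: c / 2}
    for _ in range(n - 1):
        nxt = {}
        for k, v in cur.items():
            nxt[k + 1] = nxt.get(k + 1, 0) + c * v
            nxt[k - 1] = nxt.get(k - 1, 0) + c * v
        for k, v in prev.items():
            nxt[k] = nxt.get(k, 0) - v
        prev, cur = cur, nxt
    return cur


print(t_explicit(3, 1, Fraction(3, 2)) == t_recurrence(3, Fraction(3, 2))[1])
```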

5. Limiting distribution of coefficients 

While the formula (4.5) is completely explicit, and a similar (though somewhat
more cumbersome) expression could be obtained for $R_n(c; x_1, \dots, x_k)$, for many
purposes it is more useful to have a limiting distribution formula as given by
Theorem 5.1 below. To set up the framework, we note that since all the coefficients
of $R_n(c; x_1, \dots, x_k)$ are non-negative (according to Theorem 4.2), they can be
thought of as defining a probability distribution on the integer lattice $\mathbb{Z}^k$, defined by
$p(l_1, \dots, l_k) = [x_1^{l_1}x_2^{l_2}\cdots x_k^{l_k}]\,R_n(c; x_1, \dots, x_k)/R_n(c; 1, \dots, 1)$ (where the square brackets
mean that we are extracting the coefficient of the bracketed monomial). Call the
resulting probability distribution $P_n(c; z)$, where $z$ now denotes a $k$-dimensional
vector.

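Theorem 5.1. When $c > 1$, the probability distributions $P_n(c; z/\sqrt{n})$ converge to a
normal distribution on $\mathbb{R}^k$, whose mean is 0, and whose covariance matrix $C$ is diagonal,
with entries

$\sigma^2 = \frac{1}{k}\sqrt{1 + \frac{1}{c^2 - 1}} = \frac{c}{k\sqrt{c^2 - 1}}.$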

To prove Theorem 5.1 we will use the method of characteristic functions (Fourier
transforms), and more specifically at first the Continuity Theorem ([10, Chapter XV.3,
Theorem 2]):

Theorem 5.2. In order that a sequence $\{F_n\}$ of probability distributions converges properly
to a probability distribution $F$, it is necessary and sufficient that the sequence $\{\varphi_n\}$ of their
characteristic functions converges pointwise to a limit $\varphi$, and that $\varphi$ is continuous in some
neighborhood of the origin.

In this case $\varphi$ is the characteristic function of $F$. (Hence $\varphi$ is continuous everywhere
and the convergence $\varphi_n \to \varphi$ is uniform on compact sets.)

The characteristic function $\varphi_n$ of $P_n(c; z)$ is simply

$R_n(c; \exp(i\theta_1), \dots, \exp(i\theta_k))/R_n(c; 1, \dots, 1),$

since the characteristic function is just the generating function evaluated on the
unit circle.

By definition of $R_n$,

$R_n(c; \exp(i\theta_1), \dots, \exp(i\theta_k)) = T_n\left(\frac{c}{k}\sum_{j=1}^{k}\cos\theta_j\right),$

$R_n(c; 1, \dots, 1) = T_n\left(\frac{c}{k}\sum_{j=1}^{k}\cos 0\right) = T_n(c).$
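We now use the form of Eq. (3.4):

$T_n(x) = \frac{1}{2}\left[\left(x - \sqrt{x^2 - 1}\right)^n + \left(x + \sqrt{x^2 - 1}\right)^n\right];$

setting

$u = \sum_{j=1}^{k}\cos\frac{\theta_j}{\sqrt{n}}, \qquad \theta = (\theta_1, \dots, \theta_k),$

we get

(5.1) $\varphi_n(\theta/\sqrt{n}) = \frac{1}{T_n(c)}\,\frac{1}{2}\left\{\left(\frac{c}{k}u - \sqrt{\frac{c^2}{k^2}u^2 - 1}\right)^n + \left(\frac{c}{k}u + \sqrt{\frac{c^2}{k^2}u^2 - 1}\right)^n\right\}.$

Notice, however, that for $c > 1$, the ratio of the first term in braces to the second
is exponentially small as $n \to \infty$, since the second term grows like $(c + \sqrt{c^2 - 1})^n$,
while the first as $(c - \sqrt{c^2 - 1})^n$ (since $\cos(\theta_j/\sqrt{n}) \to 1$). Since, for the same reason,
$2T_n(c) = (c + \sqrt{c^2 - 1})^n[1 + o(1)]$, we can write:

$\varphi_n(\theta/\sqrt{n}) = \left(\frac{\frac{c}{k}u + \sqrt{\frac{c^2}{k^2}u^2 - 1}}{c + \sqrt{c^2 - 1}}\right)^n + o(1).$

Substituting the Taylor expansions for the cosine terms (hidden in $u$ for typesetting
reasons), we get:

(5.2) $u = k - \frac{1}{2n}\langle\theta, \theta\rangle + o(1/n),$

so

(5.3) $\frac{c}{k}u = c - \frac{c}{2kn}\langle\theta, \theta\rangle + o(1/n).$

A similar computation gives

(5.4) $\frac{c^2}{k^2}u^2 = c^2 - \frac{c^2}{kn}\langle\theta, \theta\rangle + o(1/n).$

Substituting the last expansion into the square root, we see that

$\sqrt{\frac{c^2}{k^2}u^2 - 1} = \sqrt{c^2 - 1}\left[1 - \frac{c^2}{2kn(c^2 - 1)}\langle\theta, \theta\rangle\right] + o(1/n).$

Adding Eq. (5.3) and collecting terms, we get

(5.5) $\frac{\frac{c}{k}u + \sqrt{\frac{c^2}{k^2}u^2 - 1}}{c + \sqrt{c^2 - 1}} = 1 - \frac{c}{2kn}\left(1 + \frac{c}{\sqrt{c^2 - 1}}\right)\frac{\langle\theta, \theta\rangle}{c + \sqrt{c^2 - 1}} + o(1/n).$

Performing some further simplifications, we see that

$\varphi_n(\theta/\sqrt{n}) = \exp\left(-\tfrac{1}{2}\theta^t C\theta\right) + o(1),$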






where $C$ is the covariance matrix described in the statement of Theorem 5.1, and
Theorem 5.1 follows immediately.
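
The limiting variance can be sanity-checked numerically in the univariate case $k = 1$. The sketch below assumes that $R_n(c; x) = T_n((c/2)(x + 1/x))$ and that the limiting variance is $c/\sqrt{c^2 - 1}$, which is what the expansion of $\frac{c}{k}u + \sqrt{\frac{c^2}{k^2}u^2 - 1}$ gives for $k = 1$; the function names are illustrative only. It computes the exact variance of the coefficient distribution and divides by $n$.

```python
from fractions import Fraction
import math


def chebyshev_laurent(n, c):
    """Coefficients {k: value} of T_n((c/2)(x + 1/x)), computed exactly (n >= 1)."""
    c = Fraction(c)
    prev, cur = {0: Fraction(1)}, {1: c / 2, -1: c / 2}
    for _ in range(n - 1):
        nxt = {}
        for k, v in cur.items():
            nxt[k + 1] = nxt.get(k + 1, 0) + c * v
            nxt[k - 1] = nxt.get(k - 1, 0) + c * v
        for k, v in prev.items():
            nxt[k] = nxt.get(k, 0) - v
        prev, cur = cur, nxt
    return cur


def scaled_variance(n, c):
    """Variance of the normalized coefficient distribution, divided by n."""
    coeffs = chebyshev_laurent(n, c)
    total = sum(coeffs.values())
    # The distribution is symmetric in k, so its mean is 0.
    second_moment = sum(k * k * v for k, v in coeffs.items())
    return float(second_moment / total) / n


c = Fraction(3, 2)
print(scaled_variance(30, c), float(c) / math.sqrt(float(c) ** 2 - 1))
```

For this family the agreement is already very close at moderate $n$, in line with the observation on the speed of convergence.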



Remark 5.3. The speed of convergence in Theorem 5.1 can be estimated using
standard technology (see [10, Chapter XVI], [58, Chapter III.11]), but the speed of
convergence in practice (as checked by numerical experiments) seems to be much
better than the general estimates. Indeed the difference between $P_n$ and the
normal distribution appears to decrease almost exactly linearly in $n$.

6. Distribution mod p 

The explicit generating functions derived above can be used to study the distribution
of cyclically reduced words in $F_r$ with respect to their mod $p$ homology
class (this is the analogue, in this setting, of the work of [44]).

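Theorem 6.1. Let $h_1$ and $h_2$ be two elements of $H_1(F_r, \mathbb{Z}/p\mathbb{Z}) = (\mathbb{Z}/p\mathbb{Z})^r$, and let $W_{r,n,h_1}$
and $W_{r,n,h_2}$ be the numbers of cyclically reduced words of length $n$ in $F_r$ homologous to $h_1$ and $h_2$,
respectively. Then,

(6.1) $\lim_{n\to\infty}\frac{W_{r,n,h_1}}{W_{r,n,h_2}} = 1.$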



Proof. By elementary algebra (in one dimension, formula (6.3)), the statement of
the theorem is equivalent to the statement that

(6.2) $\lim_{n\to\infty}\frac{\varphi_n(\theta)}{\varphi_n(0)} = 0,$

for $\theta = (2\pi n_1/p, \dots, 2\pi n_r/p)$, with not all $n_j$ equal to 0 mod $p$, where $\varphi_n$ is the
characteristic function defined in the previous section.

The estimate of Eq. (6.2), however, follows immediately from the explicit formula
(5.1): indeed, in the current context,

$u(\theta) = \sum_{j=1}^{r}\cos(2\pi n_j/p),$

which is strictly smaller than $u(0)$, so the ratio of $\varphi_n(\theta)$ to $\varphi_n(0)$ goes to zero
exponentially fast in $n$. □

Remark 6.2. Another way to see the equivalence of statements (6.1) and (6.2) is
through the well-known fact that the Fourier transform is an isometry (of the
corresponding spaces). For a probability density to be close to uniform, its 
Fourier transform has to be close to that of the uniform distribution, which is a 
delta function centered at the origin, which is precisely the statement we need. 
6.1. Deviation from uniformity. Although the distribution of homology mod $p$
approaches uniformity, it turns out that there is a persistent bias in favor of certain
homology classes. This is very much akin to the Chebyshev bias, analyzed in
[54]. To simplify the discussion we project one more time: for each cyclically
reduced word in $F_r$ homologous to $a_1^{k_1}a_2^{k_2}\cdots a_r^{k_r}$ we consider $k_1 + \cdots + k_r$ mod $p$. In
this case we have a univariate distribution, whose generating function is given
by $\psi_n(x) = R_n(c; x, \dots, x)$, with $c = r/\sqrt{2r - 1}$ (as per formula (2.3); we leave in the
general $c$, to underline that our results apply to the general question on the distribution
of coefficients of the Laurent polynomials $R_n$).

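The number of elements congruent to $q$ mod $p$ is given by

(6.3) $N_{n,q} = \frac{1}{p}\sum_{j=0}^{p-1}\chi^{-jq}\,\psi_n(\chi^j),$

where $\chi = \exp(2\pi i/p)$ is a primitive $p$-th root of unity. Let us recall that

(6.4) $\psi_n(e^{ix}) = \frac{1}{T_n(c)}\,\frac{1}{2}\left\{\left(c\cos x + \sqrt{c^2\cos^2 x - 1}\right)^n + \left(c\cos x - \sqrt{c^2\cos^2 x - 1}\right)^n\right\}.$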



Note the following properties of the function $\psi_n$:

(6.5a) $\psi_n(1/x) = \psi_n(x)$.

(6.5b) If $|c\cos x| < 1$, then $|\psi_n(\exp(ix))|\,T_n(c) \le 1$.

(6.5c) $\psi_n(\exp(i(\pi - x))) = (-1)^n\,\psi_n(\exp(ix))$.

(6.5d) If $c\cos x > 1$, then $\psi_n(\exp(ix)) > 0$.

(6.5e) If $x \in [0, \arccos 1/c]$, $n \gg 1$, then $\frac{2\,|\psi_n(\exp(ix))|\,T_n(c)}{\left(c\cos x + \sqrt{c^2\cos^2 x - 1}\right)^n} = 1 + o(1)$.

(6.5f) $\psi_n(\exp(ix_1)) = o(\psi_n(\exp(ix_2)))$ for $0 \le x_2 < \arccos 1/c$, $x_2 < x_1 < \pi - x_2$.
Using Property (6.5a), we can write

(6.6) $N_{n,q} = \frac{1}{p}\left[\psi_n(1) + 2\sum_{j=1}^{(p-1)/2}\cos\frac{2\pi jq}{p}\,\psi_n(\chi^j)\right].$

Since $\cos(2\pi m/p)$ is monotonically decreasing as a function of $m$ for $0 \le m \le (p-1)/2$,
we see:



Theorem 6.3. For sufficiently large even $n$, $N_{n,q} < N_{n,0}$ for every $q \not\equiv 0$ mod $p$.

Proof. This is an immediate consequence of the monotonicity of cos, equation (6.6)
and Properties (6.5a), (6.5c), (6.5d) and (6.5f) above. □
For $q \not\equiv 0$ mod $p$, the term largest in absolute value in the sum (aside from the
$\psi_n(1)$ term) on the right hand side of eq. (6.6) is the $\psi_n(\chi^{(p-1)/2})$ term, so if we assume that $n$ is
even, then the next largest (after $N_{n,0}$) term will be $N_{n,p-2}$ (since $(p - 2)[(p - 1)/2] \equiv 1$
mod $p$), then $N_{n,p-4}$, and so on. For $n$ odd, the ordering is reversed.
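
Both the equidistribution and the persistent bias are visible in a small exact computation. The sketch below is illustrative only: it assumes that the (unnormalized) generating function is $T_n((c/2)(x + 1/x))$, takes the hypothetical values $c = 3/2$ and $p = 5$, and sums the coefficients over residue classes of the exponent.

```python
from fractions import Fraction


def chebyshev_laurent(n, c):
    """Coefficients {k: value} of T_n((c/2)(x + 1/x)), computed exactly (n >= 1)."""
    c = Fraction(c)
    prev, cur = {0: Fraction(1)}, {1: c / 2, -1: c / 2}
    for _ in range(n - 1):
        nxt = {}
        for k, v in cur.items():
            nxt[k + 1] = nxt.get(k + 1, 0) + c * v
            nxt[k - 1] = nxt.get(k - 1, 0) + c * v
        for k, v in prev.items():
            nxt[k] = nxt.get(k, 0) - v
        prev, cur = cur, nxt
    return cur


def residue_counts(n, c, p):
    """Total coefficient mass in each residue class of the exponent mod p."""
    counts = [Fraction(0)] * p
    for k, v in chebyshev_laurent(n, c).items():
        counts[k % p] += v           # Python's % maps negative exponents into 0..p-1
    return counts


counts = residue_counts(40, Fraction(3, 2), 5)
total = sum(counts)
print([float(5 * x / total) for x in counts])   # all close to 1
print(counts.index(max(counts)))                # the favoured class for even n
```

With these parameters the class $q = 0$ is the largest for even $n$ and the smallest for odd $n$, matching the reversal of the ordering described above.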

7. An extension and limiting distributions for graphs

An inspection of the proof of Theorem 5.1 reveals that in order to show that for
a sequence of probability distributions $\{P_n(x)\}$ on $\mathbb{Z}$, the distributions $\{P_n(x/\sqrt{n})\}$
converge to a limiting normal distribution with mean 0, we used the following
conditions (we will state them in a univariate setting for simplicity; the multivariate
case is the same):

Condition 1. The characteristic function of {P„} has the form 

XiPn) = + 0(1), 

where $f_j(\theta)$ is twice continuously differentiable at 0, so that $f_j(\theta) = a_j + b_j\theta + c_j\theta^2 +
o(\theta^2)$.

Condition 2. 

fll = 1, &2 = 0, C2 < 0. 

Suppose now we generalize the setting of Section [T] as follows: 
Let $\mathcal{G}$ be a connected $r$-regular non-bipartite graph, directed or not (possibly
with self-loops and multiple edges), on $k$ vertices. Let $v_1$ and $v_2$ be two vertices of
$\mathcal{G}$. Consider now the set $W_N$ of all closed walks (circuits) of length $N$ on $\mathcal{G}$. Let
$f : V(\mathcal{G}) \to \mathbb{R}$ be a function assigning a weight to each vertex of $\mathcal{G}$, and define a
random variable $X_f$ to be $\sum_{i=1}^{N} f(v_i)$ for $w = v_1, \dots, v_N \in W_N$. What can we say about
the distribution of $X_f$? It turns out that asymptotically we can say a lot. First,
however, define

$\mu(f) = \frac{1}{k}\sum_{v\in V(\mathcal{G})} f(v),$

and $f_0 = f - \mu(f)\mathbf{1}$. Define further the Laplacian $\Delta(\mathcal{G})$ of $\mathcal{G}$ to be $\Delta(\mathcal{G}) = rI - A(\mathcal{G})$,
and define $\Delta_0(\mathcal{G})$ to be $\Delta(\mathcal{G})$ viewed as an operator on the orthogonal complement
to $\mathbf{1}$ (that is, vectors with 0 sum). Let $P_N(x)$ be the distribution of $X_f$ on $W_N$.

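Theorem 7.1. The distributions $P_N((x - N\mu(f))/\sqrt{N})$ converge to a balanced (that is,
mean 0) normal distribution with variance

(7.1) $\sigma^2(f) = \frac{1}{k}\left[-\|f_0\|^2 + 2r\,f_0^t\Delta_0^{-1}(\mathcal{G})f_0\right] = \frac{1}{k}\left[f_0^t\left(-I_0 + 2r\Delta_0^{-1}(\mathcal{G})\right)f_0\right].$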
Proof. Exactly as in Section [T] we construct a generating function $g_N$ for $X_f$ on $W_N$.
To do this, let $A$ be the adjacency matrix of $\mathcal{G}$, and let

$D_k(x) = \mathrm{diag}\left(x^{f(v_1)}, \dots, x^{f(v_k)}\right).$

Then,

$g_N(x) = \mathrm{tr}\,(D_k(x)A)^N = \sum_{j=1}^{k}\lambda_j^N(D_k(x)A),$

where $\lambda_1, \dots, \lambda_k$ are eigenvalues, and, just as in Section 5, we have $\chi(P_N)(\theta) =
g_N(\exp(i\theta))/c_N$, where

$c_N = |W_N| = \sum_{j=1}^{k}\lambda_j^N(A).$

Since $\mathcal{G}$ is an $r$-regular, non-bipartite graph, it has a unique eigenvalue of maximal
modulus, and that eigenvalue is $\lambda_1 = r$.

Now, we can directly apply Conditions 1 and 2 (and accompanying comments)
above, and the results of Section 10 (noting that Assumptions 1-4 hold) to obtain
the desired result (in particular, the estimate needed in Condition 2 is precisely
Theorem 10.8). We replaced the resolvent in formula (10.9) by the equivalent (by
the discussion in the beginning of Section 10) Laplacian form, since that is more
common in graph theory. □
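
The variance formula can be tested on a small example by brute force. The sketch below is illustrative and rests on two assumptions: that formula (7.1) reads $\sigma^2(f) = \frac{1}{k}[-\|f_0\|^2 + 2r f_0^t\Delta_0^{-1}f_0]$, and that $\Delta_0^{-1}f_0$ may be computed by solving $(\Delta + J)y = f_0$ with $J$ the all-ones matrix (valid for regular graphs, since $J$ vanishes on zero-sum vectors). It counts closed walks on $K_4$ exactly by dynamic programming; all function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def closed_walk_counts(adj, f, N):
    """Map s -> number of closed walks v_1..v_N (v_N adjacent to v_1) with sum f(v_i) = s."""
    k = len(adj)
    dist = defaultdict(int)
    for start in range(k):
        layer = {(start, f[start]): 1}
        for _ in range(N - 1):
            nxt = defaultdict(int)
            for (v, s), cnt in layer.items():
                for w in range(k):
                    if adj[v][w]:
                        nxt[(w, s + f[w])] += cnt * adj[v][w]
            layer = nxt
        for (v, s), cnt in layer.items():
            if adj[v][start]:                     # close the walk
                dist[s] += cnt * adj[v][start]
    return dist


def solve(mat, rhs):
    """Gauss-Jordan elimination over the rationals."""
    n = len(mat)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(mat, rhs)]
    for col in range(n):
        piv = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[col])]
    return [row[n] for row in aug]


def sigma2(adj, f):
    """Assumed reading of (7.1) for an r-regular graph on k vertices."""
    k, r = len(adj), sum(adj[0])
    mu = Fraction(sum(f), k)
    f0 = [Fraction(x) - mu for x in f]
    lap_plus_j = [[(r if i == j else 0) - adj[i][j] + 1 for j in range(k)] for i in range(k)]
    y = solve(lap_plus_j, f0)                     # y = Delta_0^{-1} f0
    return (-sum(x * x for x in f0) + 2 * r * sum(a * b for a, b in zip(f0, y))) / k


def walk_variance(dist):
    total = sum(dist.values())
    mean = Fraction(sum(s * c for s, c in dist.items()), total)
    return Fraction(sum(s * s * c for s, c in dist.items()), total) - mean ** 2


K4 = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
f = [0, 1, 2, 3]
print(sigma2(K4, f), float(walk_variance(closed_walk_counts(K4, f, 40))) / 40)
```

For $K_4$ the non-principal eigenvalues have modulus 1 against $r = 3$, so the exact variance per step agrees with the formula up to an exponentially small error.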

Remark 7.2. If the vector $f$ is an eigenvector of $A^tA$ with eigenvalue $r^2$, the corresponding
variance is equal to zero. By Remark 10.9 this will not happen, e.g., if $G$ is
a connected non-bipartite undirected graph, but it does happen for general directed
graphs; see the discussion of the directed line graph in Section 8.

The above remark leads to the following question: 

Question 7.3. What combinatorial property of an $r$-regular directed graph $G$ is
reflected in the algebraic statement that the operator norm of $A_0(G)$ is equal to $r$?



A slight change in notation transforms Theorem 7.1 into a central limit theorem
for distributions over closed orbits of primitive irreducible Markov processes over
a finite number of states - the irreducibility is exactly equivalent to the connectivity
of the graph $\mathcal{G}$ above. For ease of reference we state this as a separate theorem.
The notation for $f$, $\mu$, etc., is as before; the space $W_N$ is now a probability space with
the obvious probability measure; $P$ is the transition matrix (note that Remark
7.2 remains valid in this setting as well).

Theorem 7.4. Let $P_N(x)$ be the distribution of $X_f$ on $W_N$. Then $P_N((x - N\mu(f))/\sqrt{N})$
converge to a balanced (that is, mean 0) normal distribution with variance

(7.2) $\sigma^2(f) = \frac{1}{k}\left[-\|f_0\|^2 + 2f_0^t(I_0 - P_0)^{-1}f_0\right] = \frac{1}{k}\left[f_0^t\left(-I_0 + 2(I_0 - P_0)^{-1}\right)f_0\right].$

Remark 7.5. We have actually shown a slightly stronger result: instead of the
trace (distribution over cycles), we could have considered the $ij$-th element of
$P$. Since the principal eigenvector varies continuously under perturbations (see
[26, Chapter II.4.1]), we could have replaced our sample space as above by
the space $C_N$ of paths of length $N$ joining the $i$-th to the $j$-th vertex. An easy
computation shows that the covariance is the covariance given in equation (7.2)
divided by a further factor of $k$. The same remark applies to Theorem 7.1.

7.1. Distribution modulo a prime. Theorems 7.1 and 7.4 have particularly simple
analogues if the function $f$ we are studying is integer valued, and we are interested
in the distribution of the $\mathbb{Z}/p\mathbb{Z}$-valued random variable $Y_f(n)$ which assigns to
each cycle of length $n$ the sum of the values of $f$ modulo $p$. In that case, under
the assumption that the adjacency matrix $A$ (in the context of Theorem 7.1) or the
transition matrix $A$ (in the context of Theorem 7.4) is irreducible and primitive
(these two conditions guarantee that $A$ has a single eigenvalue $\lambda_0$ of
maximal modulus, the eigenspace of $\lambda_0$ is one-dimensional, and the orthogonal
subspace is invariant under $A$), we see that the distributions $V_n$ of $Y_f(n)$
approach the uniform distribution (on $\mathbb{Z}/p\mathbb{Z}$) exponentially fast in $n$ (though a
more reasonable measure of the speed of convergence is the size of $W_n$, in which
case the convergence is polynomial). This statement follows from the:

Lemma 7.6. If $A$ is a matrix satisfying the conditions above, then the spectral radius $r_{UA}$
of $UA$, for $U$ any non-trivial unitary matrix such that the top eigenvector of $A$ is not also
an eigenvector of $U$, is strictly smaller than that of $A$ ($r_A$).

The proof of the lemma is immediate. 

In our case, the matrix $U$ is the diagonal matrix $U(\chi)$ with $U_{jj} = \chi_p^{f(v_j)}$, with $\chi_p$ a
non-trivial $p$-th root of unity. The speed of convergence to the uniform distribution
is given by $\left(\max_{\chi \ne 1,\ \chi^p = 1} r(U(\chi)A)\right)/r(A)$.

8. Functions on edges and distributions over paths without backtracking 

In this section we consider two kinds of questions, which are seen to be intimately
related. The first is:

Question 8.1. Let $f$ be a function on the edges of $G$. How are the averages of $f$ over
long cycles or paths in $G$ distributed?

The second question is:

Question 8.2. Let $f$ be a function on the vertices of $G$. How are the averages of $f$
distributed over long cycles in $G$ without backtracking? Such cycles are more closely
related to, e.g., geodesics on surfaces, than arbitrary cycles.

Both questions can be answered at the same time by constructing the directed
line graph (or line digraph) of $G$. This construction can be performed for either
a directed or undirected graph $G$; in Section 8.1 we will derive the results for
undirected graphs in detail, whilst in Section 8.3 we will discuss the directed case
somewhat more briefly (since the technical details are essentially identical).

8.1. The directed line graph of an undirected graph. The directed line graph of
$G$, denoted by $\mathcal{L}(G)$, is constructed as follows: The vertices of $\mathcal{L}(G)$ are edges of
$G$ labelled with a + or a -; that is, to each edge $e$ of $G$ there correspond vertices
$e_-$ and $e_+$ of $\mathcal{L}(G)$. These correspond to the two possible orientations of $e$: if the
vertices of $e$ are $v$ and $w$, then we say that $v$ is the head of $e_-$, and $w$ the tail (and
write $v = h(e_-)$, $w = t(e_-)$), while for $e_+$ this nomenclature is reversed. Two vertices
$v_1$ and $v_2$ of $\mathcal{L}(G)$ are joined by a (directed) edge if the head of $v_1$ is the same as the
tail of $v_2$, except that $e_-$ is never joined to $e_+$, and vice versa. We now make some
observations and definitions.

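Definition 8.3. Let $f$ be a function defined on the vertices of a graph $G$. We say that
a function $g$ defined on the vertices of $\mathcal{L}(G)$ is the gradient of $f$, and write $g = \nabla f$, if
$g(e) = f(t(e)) - f(h(e))$ (this is the sign convention used in Eq. (8.8) and in the proofs below).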

Definition 8.4. We can identify functions on the vertices of $G$ with (a subset of) functions
on the vertices of $\mathcal{L}(G)$. To wit, if $f$ is a function on the vertices of $G$, we let
$\mathcal{L}f(e) = f(t(e))$.

Observation 8.5. There is a natural correspondence between walks on $\mathcal{L}(G)$ and
walks on $G$ without backtracking. Indeed, passing through a vertex $e$ of $\mathcal{L}(G)$
corresponds to going from $t(e)$ to $h(e)$. Since $e_+$ is not connected to $e_-$ for any
$e \in E(G)$, any such walk is automatically without backtracking. Similarly, a cycle
on $\mathcal{L}(G)$ corresponds to a tailless cycle without backtracking on $G$.

If $G$ is an $r$-regular graph, then $\mathcal{L}(G)$ is $(r - 1)$-regular, in the strong sense: each
vertex of $\mathcal{L}(G)$ has in-degree and out-degree equal to $r - 1$ (thus the total degree is
$2r - 2$), and from the above Observation 8.5, $\mathcal{L}(G)$ is connected if and only if $G$ is.
It follows that the adjacency matrix $A(\mathcal{L}(G))$ of $\mathcal{L}(G)$ is an irreducible nonnegative
matrix, all of whose row and column sums are equal to $r - 1$. It follows that the
space of functions on the vertices of $\mathcal{L}(G)$ orthogonal to the vector $\mathbf{1}$ is an invariant
subspace of $A(\mathcal{L}(G))$ and of $A^t(\mathcal{L}(G))$ - we will, as before, denote the two matrices
restricted to this subspace by $A_0$ and $A_0^t$, respectively; the algebraic and geometric
multiplicities of the eigenvalue $r - 1$ are equal to 1, by standard Perron-Frobenius
theory. Despite this, it turns out that $A^tA$ is spectacularly degenerate. Indeed, the
$ij$-th entry of $A^tA$ is equal to the number of vertices of $\mathcal{L}(G)$ adjacent simultaneously
to the $i$-th and the $j$-th vertex. It follows that the $ii$-th entry of $A^tA$ is equal to $r - 1$,
while the $ij$-th entry is equal to $r - 2$ if the corresponding directed edges of $G$ have
the same tail, and is 0 otherwise. It follows that

(8.1) $A^tA = I_{2E(G)} + (r - 2)\,\mathrm{diag}(J_r, \dots, J_r),$

where the last term contains $V(G)$ blocks $J_r$ of size $r \times r$, each of which is the matrix of all 1s.
We thus have the following observation:

Observation 8.6. The spectrum of $A^tA$ has the following form: The eigenvalue
$(r - 1)^2$ occurs $V(G)$ times, and the corresponding eigenvectors are given precisely
by $\mathcal{L}f$ for arbitrary functions $f$ on $G$ (the Perron eigenvector corresponding to
the constant function), while the eigenvalue 1 occurs $2E(G) - V(G)$ times. The
eigenvectors are those functions on the directed edges of $G$, for which, for all
vertices $v$ of $G$, the sum of values on all the edges leaving $v$ is equal to 0.

Corollary 8.7. The operator norm of $A_0$ is equal to $r - 1$.
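
The block structure of $A^tA$ is easy to confirm on a small example. The following sketch builds the directed line graph of $K_4$ (so $r = 3$), representing each directed edge as a pair (tail, head) and forbidding immediate backtracking, and compares $A^tA$ entry by entry with $I + (r - 2)\,[\text{same tail}]$; the function names are ours and purely illustrative.

```python
def directed_line_graph(neighbours):
    """Directed line graph of a simple undirected graph given as {vertex: [neighbours]}.

    Vertices of the line graph are directed edges (tail, head); (u, v) -> (v, w) iff w != u.
    """
    darts = [(u, v) for u in neighbours for v in neighbours[u]]
    index = {d: i for i, d in enumerate(darts)}
    n = len(darts)
    A = [[0] * n for _ in range(n)]
    for (u, v) in darts:
        for w in neighbours[v]:
            if w != u:                       # no backtracking
                A[index[(u, v)]][index[(v, w)]] = 1
    return darts, A


def gram(A):
    """Return the matrix A^t A."""
    n = len(A)
    return [[sum(A[m][i] * A[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


K4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
darts, A = directed_line_graph(K4)
G = gram(A)
print(len(darts), G[0][0], sorted(set(x for row in G for x in row)))
```

The eigenvalue count of Observation 8.6 follows from this form: each of the $V(G)$ blocks contributes one eigenvalue $1 + (r - 2)r = (r - 1)^2$ and $r - 1$ eigenvalues equal to 1.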

Consider now the Laplace operator on $\mathcal{L}(G)$: $\Delta_{\mathcal{L}(G)} = (r - 1)I - A(\mathcal{L}(G))$. We will
need the following in the sequel:

Theorem 8.8. Let $E_{r-1}$ be the eigenspace of $(r - 1)^2$ for $A^tA$. If $V^*(G)$ is the space of
functions on the vertices of $G$, then

(a) : $E_{r-1} = \mathcal{L}(V^*(G))$,

(b) : $\Delta_{\mathcal{L}(G)}(E_{r-1}) = \nabla(V^*(G))$,

(c) : $\nabla(V^*(G)) \cap E_{r-1} \cap \mathbf{1}^\perp = 0$, unless $G$ is bipartite.

Proof. Part (a) is the content of Observation 8.6. Part (b) is a corollary of Part (a).
Indeed, $\Delta_{\mathcal{L}(G)}(f)(x) = (r - 1)f(x) - \sum_{h(x) = t(y)} f(y)$. If $f = \mathcal{L}g$, then

(8.2) $\Delta_{\mathcal{L}(G)}(f)(x) = (r - 1)\,(g(t(x)) - g(h(x))),$

since all the $y$ adjacent to $x$ have the same tail, equal to the head of $x$.
To show Part (c), suppose $\nabla(V^*(G)) \cap E_{r-1} \ne 0$. Let $g$ be in the intersection,
and $k$ be such that $\nabla(k) = g$. It follows that for any $x, y$ such that $t(x) = t(y)$,
$g(x) = g(y)$. We see that $k(h(x)) - k(t(x)) = k(h(y)) - k(t(y))$, which implies in turn
that $k(h(x)) = k(h(y))$. So, $k$ is the eigenvector of the eigenvalue 0 of the Laplace
operator on $G$, and hence is constant, unless $G$ is bipartite. □

We end this section with a remark necessary to compute distributions, as done
in the following section. To wit:

Remark 8.9. The adjacency matrix of the line graph of a non-bipartite graph G is 
primitive. That is, there is only one eigenvalue on the circle of radius r - 1 in the 
complex plane, and that is r - 1. Its geometric multiplicity is 1. 

Proof. Doubtlessly there are simpler arguments, but we choose to use the results
(described in [59]) on the Ihara zeta function $Z$ of $G$, which can be expressed as a
determinant in two ways:

The first way (original theorem of Ihara lEH) is: 

(8.3) $Z^{-1}(u) = (1 - u^2)^{\rho - 1}\det\left((1 + (r - 1)u^2)I - uA\right),$

with $A$ the adjacency matrix of $G$, and $\rho$ the rank of the fundamental group of $G$.
The second way (due to Hyman Bass [2]) is:

(8.4) $Z^{-1}(u) = \det(I - uM),$

where $M$ is the adjacency matrix of the directed line graph of $G$.

The equality of the two expressions implies that $v$ is an eigenvalue of $M$ if and
only if $v + (r - 1)/v$ is an eigenvalue of $A$ (we are ignoring the eigenvalues ±1, which
occur with large multiplicity in the spectrum of $M$). Suppose that $v$ has modulus
$r - 1$, so that $v = (r - 1)\exp(i\theta)$, for some $\theta$. It follows that $w = (r - 1)\exp(i\theta) + \exp(-i\theta)$
is an eigenvalue of $A$, and since $A$ is symmetric, $w$ is real, so $\theta \in \{0, \pi\}$. If $\theta = 0$, $v = r - 1$, while
if $\theta = \pi$, $v = -(r - 1)$, but then $w = -r$ is an eigenvalue of $A$, and so $G$ is bipartite.

The statement about the multiplicity of the eigenvalue $r - 1$ is immediate, since
$\mathcal{L}(G)$ is clearly strongly connected. □
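
The equality of the two determinant expressions can be checked numerically on a small regular graph. The sketch below evaluates both sides exactly at rational values of $u$ for $K_4$, assuming the forms $(1 - u^2)^{\rho - 1}\det((1 + (r - 1)u^2)I - uA)$ and $\det(I - uM)$, with $\rho = E - V + 1$; the helper names are hypothetical.

```python
from fractions import Fraction


def directed_line_graph(neighbours):
    """Adjacency matrix of the directed line graph; (u, v) -> (v, w) iff w != u."""
    darts = [(u, v) for u in neighbours for v in neighbours[u]]
    index = {d: i for i, d in enumerate(darts)}
    M = [[0] * len(darts) for _ in darts]
    for (u, v) in darts:
        for w in neighbours[v]:
            if w != u:
                M[index[(u, v)]][index[(v, w)]] = 1
    return M


def det(mat):
    """Determinant by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in mat]
    n, d = len(m), Fraction(1)
    for c in range(n):
        piv = next((i for i in range(c, n) if m[i][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d *= m[c][c]
        for i in range(c + 1, n):
            factor = m[i][c] / m[c][c]
            if factor:
                m[i] = [a - factor * b for a, b in zip(m[i], m[c])]
    return d


def both_sides(neighbours, u):
    """Return the vertex-matrix and edge-matrix expressions for an r-regular graph."""
    verts = sorted(neighbours)
    r = len(neighbours[verts[0]])
    V, E = len(verts), sum(len(x) for x in neighbours.values()) // 2
    rho = E - V + 1
    vertex_mat = [[(1 + (r - 1) * u * u if a == b else 0) - (u if b in neighbours[a] else 0)
                   for b in verts] for a in verts]
    M = directed_line_graph(neighbours)
    edge_mat = [[(1 if i == j else 0) - u * M[i][j] for j in range(len(M))] for i in range(len(M))]
    return (1 - u * u) ** (rho - 1) * det(vertex_mat), det(edge_mat)


K4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
print(both_sides(K4, Fraction(1, 5)))
```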

We include the following observations both for the sake of completeness, and
in view of Lemma 8.14 below.

Lemma 8.10.

$\Delta\mathcal{L} = (r - 1)\nabla.$

Proof. Indeed, $\mathcal{L}(f)(x) = f(t(x))$. Further,

(8.5) $\Delta\mathcal{L}(f)(x) = (r - 1)f(t(x)) - \sum_{t(y) = h(x)} f(t(y)) = (r - 1)\,(f(t(x)) - f(h(x))) = (r - 1)\nabla(f)(x).$

□



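Lemma 8.11. For any $f, g \in V^*(G)$, we have

$(\mathcal{L}f)^t\nabla g = f^t\Delta g.$

Proof. Indeed,

(8.6) $(\mathcal{L}f)^t\nabla g = \sum_{x} f(t(x))\,(g(t(x)) - g(h(x))) = \sum_{v\in V(G)} f(v)\sum_{w \text{ adjacent to } v}(g(v) - g(w)) = \sum_{v\in V(G)} f(v)\,(\Delta g)(v) = f^t\Delta g.$

□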



Consider now a function $g$ on the directed edges of $G$. How do we decompose it
into a gradient and a function orthogonal to gradients? First, we note that a basis
of the gradients is formed by the gradients of $\delta$ functions:

(8.7) $\delta_v(x) = \begin{cases} 1 & x = v, \\ 0 & \text{otherwise.} \end{cases}$

So that

(8.8) $\nabla\delta_v(x) = \begin{cases} 1 & t(x) = v, \\ -1 & h(x) = v, \\ 0 & \text{otherwise.} \end{cases}$

The functions $\nabla\delta_v$ form a basis of $\nabla(V^*(G))$, though not an orthonormal one.
Now, note that

$g^t\nabla\delta_v = \sum_{t(x) = v} g(x) - \sum_{h(y) = v} g(y).$

In other words,

Lemma 8.12. $g$ is orthogonal to the gradients if and only if the sum of $g$ over the edges
coming into any vertex $v$ is equal to the sum of $g$ over the edges leaving $v$. An equivalent
condition is that $\nabla^t g = 0$.
One may ask: what is the orthogonal projection of a given $\mathcal{L}f$ onto the gradients?
The following comes out of an easy computation:

Observation 8.13. The orthogonal projection of $\mathcal{L}f$ onto the set of gradients is $\nabla\Delta f$.

8.2. Applications to distribution. We can use the results of the previous section
to understand the limiting distribution of functions defined on (directed) edges
of $G$. Indeed, we can use Theorem 7.1 in the form corresponding to Eq. (10.10) to
observe that

(8.9) $\sigma^2(f) = \frac{1}{2E(G)}\,f^t(\Delta_0^{-1})^t\left((r - 1)^2I - A(\mathcal{L}(G))^tA(\mathcal{L}(G))\right)\Delta_0^{-1}f$

for $f$ any function on the directed edges of $G$, and $\Delta_0$ the restriction of the Laplace
operator on $\mathcal{L}(G)$ to the subspace of 0-sum vectors.

Lemma 8.14. The right hand side of equation (8.9) vanishes precisely when $f$ is the gradient
of a function on the vertices of $G$.

Proof. Let $f = \Delta w$. By Observation 8.6 we see that the right hand side of Eq. (8.9)
vanishes precisely if $w \in \mathcal{L}(V^*(G))$. By part (b) of Theorem 8.8 it follows that this is
so if and only if $f \in \nabla(V^*(G))$. □

One direction of the above lemma is just common sense, since the sum over any 
cycle of a gradient is equal to 0. 

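Keeping the above in mind, we note that a simpler form of the covariance is
given by Theorem 7.1:

(8.10) $\sigma^2(f) = -\frac{1}{2E(G)}\left[f^t\left(I - 2(r - 1)\Delta_0^{-1}\right)f\right].$

For functions on the vertices of $G$, the above assumes the form:

(8.11) $\sigma^2(f) = -\frac{1}{2E(G)}\left[f^t\mathcal{L}^t\left(I - 2(r - 1)\Delta_0^{-1}\right)\mathcal{L}f\right].$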

8.3. The line graph of a directed graph. The construction of the line graph of a
directed graph $G$ is essentially the same as that of an undirected graph. This time,
the vertices of $\mathcal{L}(G)$ are the edges of $G$, without labels (so $\mathcal{L}(G)$ has $E(G)$ vertices). The operators $\nabla$
and $\mathcal{L}$ are defined as in Section 8.1. We have an observation even simpler than
Observation 8.5:

Observation 8.15. There is a natural bijective correspondence between walks on
$\mathcal{L}(G)$ and walks on $G$.
If $G$ is an $r$-regular directed graph (by this we mean that both the in- and out-degree
of each vertex is equal to $r$), then so is $\mathcal{L}(G)$; by Observation 8.15, $\mathcal{L}(G)$ is
connected whenever $G$ is. As before, $A(\mathcal{L}(G))$ is the adjacency matrix of $\mathcal{L}(G)$. We
can compute:

(8.12) $A^t(\mathcal{L}(G))A(\mathcal{L}(G)) = r\,\mathrm{diag}(J_r, \dots, J_r),$

with $V(G)$ blocks $J_r$, where each block corresponds to the set of edges of $G$ emanating from a given
vertex. From this we have:

Observation 8.16. The spectrum of $A^t(\mathcal{L}(G))A(\mathcal{L}(G))$ has the following form: The
eigenvalue $r^2$ occurs $V(G)$ times, and the corresponding eigenvectors are given
by $\mathcal{L}f$ for arbitrary functions $f$ on $G$ (the Perron eigenvector corresponding to
the constant function), while the eigenvalue 0 occurs $E(G) - V(G)$ times. The
eigenvectors are those functions on the edges of $G$ for which the sum of the
values over all edges leaving a vertex $v$ is equal to 0 (for all $v$).

Corollary 8.17. The operator norm of $A_0(\mathcal{L}(G))$ is equal to $r$.

The Laplace operator on $\mathcal{L}(G)$ is defined as: $\Delta_{\mathcal{L}(G)} = rI - A(\mathcal{L}(G))$.
We have

Theorem 8.18. Let $E_r$ be the eigenspace of $r^2$ for $A^tA$. If $V^*(G)$ is the space of functions
on the vertices of $G$, then

(a) : $E_r = \mathcal{L}(V^*(G))$,

(b) : $\Delta_{\mathcal{L}(G)}(E_r) = \nabla(V^*(G))$.

We also include


Remark 8.19. The adjacency matrix of the line graph of G is primitive if the adjacency 
matrix of G is. 

Proof. We use Observation 8.15 and Theorem 15.1 to note that the non-zero eigenvalues
of $G$ are exactly the same as those of $\mathcal{L}(G)$, since $\det(I - uA(G)) = \det(I -
uA(\mathcal{L}(G)))$. □



Lemma 8.20.

$\Delta\mathcal{L} = r\nabla.$

Lemma 8.21. For any $f, g \in V^*(G)$, we have

$(\mathcal{L}f)^t\nabla g = f^t\Delta g.$

Lemma 8.12 and Observation 8.13 go through without change.

The results of Section 8.2 go through essentially without change. Since some
constants change we restate them here. First, let $f$ be a function defined on the
edges of $\mathcal{L}(G)$. We see that:

(8.13) $\sigma^2(f) = \frac{1}{E(G)}\,f^t(\Delta_0^{-1})^t\left(r^2I - A(\mathcal{L}(G))^tA(\mathcal{L}(G))\right)\Delta_0^{-1}f.$

Lemma 8.14 holds as well, and this gives us the following useful corollary (a
homological condition) about distribution on $G$ itself:

Theorem 8.22. The variance of a function $f$ on the vertices of $G$ vanishes precisely when
there exists a function $g$, such that $\mathcal{L}f = \nabla g$.

Finally, we have a version of formula (8.10):

(8.14) $\sigma^2(f) = -\frac{1}{E(G)}\left[f^t\left(I - 2r\Delta_0^{-1}\right)f\right].$

9. Distribution in compact groups 

The methods of Section 7.1 can be adapted to the following setting: Let $G$
be a graph, and $T$ be a compact topological group. Label the $i$-th vertex of $G$ with
$t_i \in T$. Now, associate to each cycle $c = v_1, \dots, v_k$ on $G$ the element $t_c = t_k\cdots t_1 \in T$.
We ask: as $c$ varies over the cycle space $W_N$, how are the elements $t_c$ distributed in
$T$ (with respect to the Haar measure)? The answer is given by the following:

Theorem 9.1. If the graph $G$ is as before (connected, non-bipartite), the closed subgroup
generated by the $t_j$ ($j = 1, \dots, k$) is equal to $T$, and the elements $t_j$ do not all lie in the same
coset with respect to a one-dimensional representation of $T$, then the elements $t_c$ become
equidistributed, as $N \to \infty$.

Proof. As before, the equidistribution is equivalent to the assertion that for a non-trivial
irreducible unitary representation $\rho$,

(9.1) $\sum_{c\in W_N}\mathrm{tr}\,\rho(t_c) = o(|W_N|).$

This follows from the Fourier transform formula for compact groups; see [[TT| for 
the finite case, p63| for the general compact topological group case. See also [ST] 
Now, let $U(\rho)$ be the $k\deg\rho \times k\deg\rho$ block-diagonal matrix whose $j$-th block is
just $\rho(t_j)$. Furthermore, as before, let $A(G)$ be the adjacency matrix of $G$, and
$A_l(G) = A(G)\otimes I_l$ (where $I_l$ is the $l \times l$ identity matrix: in other words, $A_l(G)$ is a
$kl \times kl$ matrix, obtained from $A(G)$ by replacing each element $a_{ij}$ by an $l \times l$ diagonal matrix
$M_{ij}$, all of whose diagonal elements are equal to $a_{ij}$). It is not hard to see that the left hand
side of Eq. (9.1) is equal to $\mathrm{tr}\,(U(\rho)A_{\deg\rho}(G))^N$, and so it suffices to show that the
spectral radius of $M_\rho = U(\rho)A_{\deg\rho}(G)$ is strictly smaller than the spectral radius of
$A(G)$ (which we normalize to be equal to 1 by scaling) under the hypotheses of the
theorem. Suppose not. Since $U(\rho)$ is unitary, the worst that can happen is that
there exists a unit vector $v$, such that $\|M_\rho(v)\| = 1$. If that is so, $v$ is contained in the
eigenspace of eigenvalue 1 of $A_{\deg\rho}$. In such a case, $v = v_1\otimes u$, where $u \in V(\rho)$, and
$v_1$ is an eigenvector of $A(G)$ with eigenvalue 1. If $v_1 = (v_1^1, \dots, v_1^k)$, then $v_1^i u$ must be
an eigenvector of $\rho(t_i)$, for all $i$. Since $v_1^i \ne 0$, $\forall i$, this implies that $u$ is an eigenvector of
$\rho(t_i)$, $\forall i$. Since $\rho$ is irreducible, this implies that either the elements $t_1, \dots, t_k$ do not
generate all of $T$, or $\rho$ is 1-dimensional, in which case clearly $\rho(t_i) = \rho(t_j)$, $\forall i, j$,
which proves the theorem. □

Remark 9.2. As in Remark 7.5, the above argument also works if we pick all paths
between the $i$-th and the $j$-th vertex of $G$, instead of all cycles.

10. Some perturbations and estimates 

Consider an analytic family of linear operators $M(x)$, acting on $\mathbb{R}^k$, with $M(0) =
M$, and let $\lambda$ be a simple eigenvalue of $M$. Then, if

$M(x) = M + M^{(1)}x + M^{(2)}x^2 + \dots,$

perturbation theory (see [26, page 79, (2.33)]) tells us that

$\lambda(x) = \lambda + \lambda^{(1)}x + \lambda^{(2)}x^2 + \dots,$

where

(10.1) $\lambda^{(1)} = \mathrm{tr}\,M^{(1)}P_\lambda,$

(10.2) $\lambda^{(2)} = \mathrm{tr}\left[M^{(2)}P_\lambda - M^{(1)}S_\lambda M^{(1)}P_\lambda\right],$

where $P_\lambda$ is the projection onto the eigenspace of $\lambda$, while $S_\lambda$ is the reduced resolvent
of $M$ at $\lambda$, which is the holomorphic part of the resolvent of $M$ at $\lambda$, defined by the
properties

(10.3) $S_\lambda P_\lambda = P_\lambda S_\lambda = 0; \quad (M - \lambda I)S_\lambda = S_\lambda(M - \lambda I) = I - P_\lambda,$

(in other words, $S_\lambda$ is the inverse of $M - \lambda I$ restricted to the orthogonal complement
of the eigenspace of $\lambda$), and thus

(10.4) $MS_\lambda = I - P_\lambda + \lambda S_\lambda.$

Now we will specialize a bit:
Assumption 1. The eigenvalue $\lambda$ is such that the constant vector $\mathbf{1}$ spans the
eigenspace of $\lambda$.

In this case, $P_\lambda = J_k/k$, where we recall that $J_k$ is the $k \times k$ matrix of all 1s.
In addition,

Assumption 2. We will assume that $M(x) = D(x)M$, where $D(x)$ is an analytically
varying diagonal matrix, $D(x) = D + D^{(1)}x + D^{(2)}x^2 + \dots$, where we say that the
diagonal elements of $D^{(i)}$ are $d^{(i)} = (d_1^{(i)}, \dots, d_k^{(i)})$.

Lemma 10.1. Let $A = (A_{ij})$ be an $n \times n$ matrix. Then

$\mathrm{tr}\,AJ_n = \sum_{1\le i,j\le n} A_{ij}.$

Lemma 10.2. Let $A = (A_{ij})$ be an $n \times n$ matrix, and let $X$ be an $n \times n$ diagonal matrix.
Then

$(XA)_{ij} = A_{ij}X_{ii},$
$(XAX)_{ij} = A_{ij}X_{ii}X_{jj}.$

Lemma 10.3. Let $D$ be a diagonal matrix, with diagonal elements $d_1, \dots, d_n$. Then

$v^tDv = \sum_{i=1}^{n} d_iv_i^2.$

The proofs of the above lemmas are immediate.

Lemma 10.4. Let $P_v$ be the projection operator on the subspace generated by $v$ (a unit
vector). Then

$\mathrm{tr}\,MP_v = v^tMv.$

In particular, if $v$ is an eigenvector of $M$ with eigenvalue $\lambda$, then $\mathrm{tr}\,MP_v = \lambda\|v\|^2 = \lambda$.

Proof. This follows by a direct computation, since when $v$ is a unit vector, $(P_v)_{ij} =
v_iv_j$. □

Lemma 10.5. If $v$ is an eigenvector of $M$ with eigenvalue $\lambda$, then $MP_v = \lambda P_v$.

Lemma 10.6. Suppose that $\lambda$ has multiplicity 1, and $v(\lambda)$ is a unit vector generating the
eigenspace of $\lambda$, and $M(t) = D(t)M$, where $D(t)$ is a diagonal matrix. Then

$\lambda'(M) = \lambda\,v^t(\lambda)D'v(\lambda).$

Proof. By Formula (10.1), we have

$\lambda'(M) = \mathrm{tr}\,M'P_{v(\lambda)} = v^t(\lambda)M'v(\lambda) = \lambda\,v^t(\lambda)D'v(\lambda).$

□
□ 
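Corollary 10.7. In the case when $v(\lambda) = \frac{1}{\sqrt{k}}\mathbf{1}$, we have:

(10.5) $\lambda' = \frac{\lambda}{k}\sum_{j=1}^{k} d_j',$

where $d_1', \dots, d_k'$ are the diagonal elements of $D'$.

To compute the second derivative of $\lambda$, we use the formula (10.2) (we are assuming
that $\lambda$ is an isolated eigenvalue with eigenvector $v(\lambda)$, and $M(t) = D(t)M$,
as before):

$\lambda'' = \mathrm{tr}\left[M''P_{v(\lambda)} - M'S_\lambda M'P_\lambda\right]$
$\quad = \lambda v^tD''v - \mathrm{tr}\left[M'S_\lambda M'P_\lambda\right]$
$\quad = \lambda v^tD''v - \lambda\,\mathrm{tr}\left[D'MS_\lambda D'P_\lambda\right]$
$\quad = \lambda v^t\left[D'' - D'MS_\lambda D'\right]v.$

We can now use the formula (10.4) to get:

(10.6) $\lambda'' = \lambda v^t\left[D'' - D'(I - P_\lambda)D' - \lambda D'S_\lambda D'\right]v.$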

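In the special case where the eigenvector $v$ is proportional to $\mathbf{1}$, we can rewrite
the formula in coordinates in a simple way. To wit, any diagonal matrix $D$ can be
written (uniquely) as $D_0 + dI$, where $D_0$ is such that $\mathrm{tr}\,D_0 = 0$. A simple computation
then shows that

(10.7) $\lambda'' = \frac{\lambda}{k}\left[\sum_{j=1}^{k} d_j'' - \sum_{j=1}^{k}(d_{0,j}')^2 - \lambda\sum_{i,j=1}^{k}(S_\lambda)_{ij}\,d_i'd_j'\right],$

where $d_j'$, $d_j''$ and $d_{0,j}'$ denote the diagonal elements of $D'$, $D''$ and $D_0'$, respectively.
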

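The case we are interested in is still more special, and that is where
Assumption 3.

$D(x) = \mathrm{diag}\left(\exp(if_1x), \exp(if_2x), \dots, \exp(if_kx)\right).$

Here, $d^{(1)} = (if_1, if_2, \dots, if_k)$, while $d^{(2)} = -\tfrac{1}{2}(f_1^2, f_2^2, \dots, f_k^2)$, and so, letting $f =
(f_1, \dots, f_k)$,

(10.8) $\lambda^{(2)} = \frac{\lambda}{k}\left[-\tfrac{1}{2}\|f\|^2 + \|f_0\|^2 + \lambda f^tS_\lambda f\right],$

where, as before, $f_0$ is the component of $f$ orthogonal to constants.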
To show our final estimates we shall need

Assumption 4. The matrix $M$ is $\lambda > 0$ times a doubly stochastic matrix (this
implies that the operator norm and the spectral radius of $M$ are both equal to $\lambda$).

Theorem 10.8. With assumptions as above, if in addition $\mathbf{f} = \mathbf{f}_0$ (that is, $\sum_{j=1}^{k} f_j = 0$),
then $\lambda^{(2)}$ is nonpositive.

Proof. Since $\mathbf{f} = \mathbf{f}_0$, Equation (10.8) can be rewritten as

(10.9) $\lambda^{(2)} = -\frac{\lambda}{2k} \left[ -\|\mathbf{f}_0\|^2 - 2\lambda \mathbf{f}_0^t S_\lambda \mathbf{f}_0 \right] = -\frac{\lambda}{2k} \left[ \mathbf{f}_0^t (-I - 2\lambda S_\lambda) \mathbf{f}_0 \right]$.

If we regard $S_\lambda$ as an operator on the orthogonal complement to $\mathbf{1}$, then by equations
(10.3) and (10.4), $S_\lambda(\lambda I_0 - M_0) = -I_0$. Let $v = -S_\lambda \mathbf{f}_0$. Then the term in square
brackets in Eq. (10.9) can be rewritten as:

(10.10) $v^t (\lambda I_0 - M_0)^t (-I - 2\lambda S_\lambda) (\lambda I_0 - M_0) v = v^t (\lambda^2 I_0 - M_0^t M_0) v$,

where we have used the fact that for any matrix $A$ and any vector $v$, $v^t A v = v^t A^t v$.
The quadratic form $\lambda^2 I_0 - M_0^t M_0$ is positive semi-definite, since the biggest
eigenvalue of the symmetric matrix $M_0^t M_0$ is equal to the square of the operator
norm of $M_0$, which, in turn, is no greater than $\lambda$, by Assumption 4 (since $M^t M$ is
$\lambda^2$ times a doubly stochastic matrix). □

Remark 10.9. In the statement of Theorem 10.8 the word "non-positive" can be
improved to "negative" under the further assumption that $M$ is irreducible, primitive,
and normal.

Proof. Since the orthogonal complement to the subspace generated by the vector
$\mathbf{1}$ is invariant under $M$, it follows that $M_0$ is also normal, and so its operator
norm is equal to its spectral radius $\mu$. Under the assumptions of irreducibility and
primitivity, Perron-Frobenius theory tells us that $\mu < \lambda$. □

11. Topological entropy 

Consider a graph $G$, and consider a positive function $f$ on its vertices. For each
cycle $c$ we let $F(c)$ be the sum of values of $f$ over $c$, and we want to know
how many $c$ there are for which $F(c) < L$. We denote that number by $N(f, L)$,
and we ask ourselves how $N(f, L)$ behaves asymptotically as $L$ tends to infinity.
To understand $N(f, L)$, we consider first the matrix $U(f) = D(u^{f_1}, \dots, u^{f_n})A(G)$. As
before, we observe that the coefficient of $u^r$ in $\operatorname{tr} U^n(f)$ is the number of cycles of
(combinatorial) length $n$ for which $F(c) = r$. Write a formal series

$L(f, u) = \sum_{n=0}^{\infty} \operatorname{tr} U^n(f)$.

This series converges for sufficiently small $u$, and can there be written in closed
form as $L(f, u) = \operatorname{tr} (I - U(f))^{-1}$, from which it follows that the exponential rate of
growth of $N(f, L)$ is equal to the negative logarithm of the radius of convergence of $L(f, u)$
- we call this the entropy of $G, f$ - which, in turn, is equal to the smallest positive
real value of $u$ such that the spectral radius of $U(f)$ is equal to 1. Since it is more
convenient to deal with analytic functions (which $L(f, u)$ is not, for arbitrary real
values of $f_i$), we write $u = \exp(-s)$, and now ask for the abscissa of convergence
of $L(f, \exp(-s))$. This will give us the entropy. In this section we use perturbation
methods in a rather straightforward way to get explicit information on the entropy.

Let $A$ be an $n \times n$ non-negative primitive irreducible matrix. Let $f_1, \dots, f_n$ be a
collection of weights. We then define the matrix $E(s, \mathbf{f})$ to be the diagonal matrix
whose $i$-th element is equal to $\exp(-s f_i)$. Define $M(s, \mathbf{f})$ to be $M(s, \mathbf{f}) = E(s, \mathbf{f})A$.
We are interested in $\rho(s, \mathbf{f})$: the spectral radius of $M(s, \mathbf{f})$. By Perron-Frobenius
theory we know that there is a real eigenvalue of $M(s, \mathbf{f})$ equal to $\rho(s, \mathbf{f})$, and the
eigenvector $v_\rho$ of this eigenvalue is positive.

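Lemma 11.1.

(11.1) $\frac{\partial \rho}{\partial s} = -\rho v_\rho^t D(f_1, \dots, f_n) v_\rho$.

For positive $\mathbf{f}$, $\frac{\partial \rho}{\partial s} < 0$.

Proof. This follows immediately from Lemma 10.4 and the positivity of $\rho$ and $v_\rho$. □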
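Since $\rho(s, \mathbf{f})$ is decreasing in $s$ for positive weights, the value of $s$ at which $\rho = 1$ can be located by bisection. The sketch below is an illustration under stated assumptions rather than a method taken from the text: it uses plain power iteration for the spectral radius (adequate for small primitive matrices), assumes that $\rho(0, \mathbf{f}) \ge 1$ and that the root lies below the search bound `hi`, and the names `spectral_radius`, `weighted_matrix` and `critical_exponent` are hypothetical.

```python
import math


def spectral_radius(m, iters=500):
    """Perron eigenvalue of a primitive non-negative matrix, by power iteration."""
    v = [1.0] * len(m)
    rho = 0.0
    for _ in range(iters):
        w = [sum(a * x for a, x in zip(row, v)) for row in m]
        rho = max(w)              # v is kept normalised so that max(v) == 1
        v = [x / rho for x in w]
    return rho


def weighted_matrix(a, f, s):
    """M(s, f) = E(s, f) A, where E(s, f) = diag(exp(-s * f_i))."""
    return [[math.exp(-s * fi) * x for x in row] for fi, row in zip(f, a)]


def critical_exponent(a, f, hi=50.0, tol=1e-12):
    """Bisect for the s with rho(s, f) = 1, using that rho decreases in s."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if spectral_radius(weighted_matrix(a, f, mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    golden_mean_graph = [[1, 1], [1, 0]]
    print(critical_exponent(golden_mean_graph, [1.0, 1.0]))
```

With all weights equal to 1 the matrix is $e^{-s}A$, so the computed value is simply the logarithm of the Perron eigenvalue of $A$.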

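Lemma 11.2. We have the following expression for the gradient of $\rho$ with respect to $\mathbf{f}$:

(11.2) $\nabla_{\mathbf{f}} \rho = -s\rho (v_1^2, \dots, v_n^2)$,
where $v_\rho = (v_1, \dots, v_n)$.

Proof. We note that

$\frac{\partial M}{\partial f_i} = -s D(0, \dots, 1, \dots, 0) M$,
where the 1 is in the $i$-th place. Thus, by formula (10.1) we have

$\frac{\partial \rho}{\partial f_i} = -s v_\rho^t D(0, \dots, 1, \dots, 0) M v_\rho = -s\rho v_i^2$.

□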



This can be restated as saying that the derivative of $\rho$ in the direction of a vector
$\mathbf{g}$ is equal to $-\rho s v_\rho^t D(\mathbf{g}) v_\rho$.

This gives us the following important corollary: 
Corollary 11.3. Consider deformations $\mathbf{g}$ keeping the sum of the $f_i$ fixed. Then the critical
points of $\rho$ occur precisely for those $\mathbf{f}$ for which $|v_i| = |v_j|$, for any $i, j$.

We can also compute the second directional derivative of $\rho$. Indeed, let $\mathbf{g} =
(g_1, \dots, g_n)$ be the direction vector, so that we want to compute the second
derivative with respect to $t$ of $\rho(s, \mathbf{f} + t\mathbf{g})$ at $t = 0$. To do this, we use the formula
(10.2):

(11.3) $\rho'' = \operatorname{tr} [M'' P_{v(\rho)} - M' S_\rho M' P_\rho]$.
Note that (as in the proof of Lemma 11.2)

(11.4) $M' = -s D(g_1, \dots, g_n) M$,
while

$M'' = s^2 D(g_1^2, \dots, g_n^2) M$,

and so

(11.5) $\operatorname{tr} M'' P_{v(\rho)} = s^2 \rho v^t D(g_1^2, \dots, g_n^2) v = s^2 \rho [D(g_1, \dots, g_n) v]^t [D(g_1, \dots, g_n) v]$.

To understand the second term of the right-hand side of Eq. (11.3), first note that
(by Eq. (11.4))

$M' S_\rho M' P_\rho = s^2 D(g_1, \dots, g_n) M S_\rho D(g_1, \dots, g_n) M P_\rho = \rho s^2 D(g_1, \dots, g_n) M S_\rho D(g_1, \dots, g_n) P_\rho$,
where the second equality is by Lemma 10.5. Now

(11.6) $\operatorname{tr} M' S_\rho M' P_\rho = \rho s^2 v_\rho^t D(g_1, \dots, g_n) M S_\rho D(g_1, \dots, g_n) v$

(11.7) $= \rho s^2 [D(g_1, \dots, g_n) v]^t M S_\rho [D(g_1, \dots, g_n) v]$.
Putting together Eq. (11.5) and Eq. (11.6), we see that

(11.8) $\rho'' = \rho s^2 [D(g_1, \dots, g_n) v]^t (I - M S_\rho) [D(g_1, \dots, g_n) v]$.
Using the formula (10.4), equation (11.8) simplifies further to:

(11.9) $\rho'' = \rho s^2 [D(g_1, \dots, g_n) v]^t [P_{v(\rho)} - \rho S_\rho] [D(g_1, \dots, g_n) v]$.

The following lemma is not surprising: 

Lemma 11.4. The quadratic form given by $P_v - \rho S_\rho$ is positive-definite.

Proof. On the span of $v$, the projection operator $P_v$ is equal to the identity, whilst
the reduced resolvent $S_\rho$ vanishes. On the orthogonal complement, the projection
operator vanishes, so since the Perron-Frobenius eigenvalue $\rho$ is positive, we
need to show that $S_\rho$ is negative definite. Consider a vector $w$ in the orthogonal
complement of $v$. Such a $w$ is equal to $(\rho I - M)z$, for some $z$ orthogonal to $v$. So,

$w^t S_\rho w = z^t (\rho I - M) z$,

and it will suffice to show that $(\rho I - M)$ is negative-definite. Suppose not. Then
there exists a $z_0$ such that $z_0^t M z_0 \ge \rho \|z_0\|^2$. By the argument in the proof of Theorem
10.8 we see that $\|M z_0\| \le \rho \|z_0\|$. So, $z_0^t M z_0 \ge \rho \|z_0\|^2$ implies that $\langle z_0, M z_0 \rangle \ge \rho \|z_0\|^2$,
and hence that $z_0$ is an eigenvector of $M$ with eigenvalue $\rho$, which is impossible
by the assumption that $M$ is irreducible and primitive. □

We finish with 

Theorem 11.5. Let $s_0(\mathbf{f})$ be the unique $s$ such that $\rho(s_0, \mathbf{f})$ is equal to 1. Then $s_0$ is a convex
function of $\mathbf{f}$, and hence assumes a unique minimum on each linear subspace of values of
$\mathbf{f}$. In particular, if we restrict to the subspace $F_0$, where the sum of the values of $\mathbf{f}$ is
equal to 1, then the minimum is achieved at the point where

$f_i = \frac{\log (A\mathbf{1})_i}{\sum_j \log (A\mathbf{1})_j}$,

in which case the entropy is equal to $\sum_i \log (A\mathbf{1})_i$.

Proof. The convexity of $s_0$ follows from Lemma 11.4 and Lemma 11.1. The point at
which the minimum is achieved is computed easily using Corollary 11.3, as is the
value of entropy. □



12. Applications to Groups and other objects 

The asymptotic results in the previous sections apply directly to the question 
of the growth of homology classes in the free groups, and give in some sense 
complete information: 

Observation 12.1. We see that the asymptotic order of growth of any two fixed 
homology classes is the same. 

Observation 12.2. Theorem 7.1 shows in particular that a random long cycle is
equidistributed among the vertices of a regular graph.

Observation 12.3. We see that the order of growth of the number of words of length $n$ in
any fixed homology class in $F_k$ is asymptotic to $c_k (2k - 1)^n / n^{k/2}$, where $c_k$ is easily
computed using the expression for $\sigma$ in the statement of Theorem 5.1, keeping in
mind that

$c = \frac{k}{\sqrt{2k - 1}}$,

where $c$ is the parameter in the statements of theorems of the last two sections.
Alternately, Theorem 7.1 can be used.
We can compute other growth functions. For example, let $h : F_n \to \mathbb{Z}$ be the
"total exponent" homomorphism, i.e. if $F_n = \langle a_1, \dots, a_n \rangle$, then $h(a_i) = 1$. We see
that the generating function for the preimages of / G Z is given by 

$(2\sqrt{2n - 1})^{l} R_l\left(\frac{n}{\sqrt{2n - 1}}; x, \dots, x\right) = (2\sqrt{2n - 1})^{l} R_l\left(\frac{n}{\sqrt{2n - 1}}; x\right)$.

Observation 12.4. Instead of cyclically reduced words, it is perhaps more natural to
study conjugacy classes (ordered by their cyclically reduced length). It seems futile
to seek any enumeration as neat as Theorem 2.3; however, since the relationship
between the number $C_k$ of conjugacy classes of words of length $k$ and the number
of cyclically reduced words $W_k$ is:

(12.1) $C_k = \frac{W_k}{k} + o(\sqrt{W_k})$,

it is clear that the asymptotic results are the same for the two problems. For more
on this subject, see Section 13 and the sequel.

Observation 12.5. Counting conjugacy classes is a problem closely related to that of
counting closed geodesics on manifolds. In the context of compact hyperbolic surfaces,
it was observed by P. Sarnak (see, for example, [54]) that among all geodesics
shorter than $L$, null-homologous geodesics are more numerous than those in any
other prescribed homology class (that is, while the ratio of the two quantities
approaches 1, the difference is asymptotically positive). The results of the current
note provide a certain justification for this, since any limiting distribution likely
to arise in this context is, for reasons of symmetry, likely to be unimodal, with the
mode at 0. Certainly this is true of the normal distribution, though even in this
case, a careful analysis of the error terms is required.

13. Counting conjugacy classes

Consider a finitely presented group $G$. Let $g$ be an element of $G$. We define
the reduced length of $g$ - denoted by $|g|$ - to be the length of the shortest word
in the generators of $G$ representing $g$. We define the length up to conjugacy of $g$
- denoted by $|g|_c$ - to be the minimum of $|h|$, the minimum being taken over all
group elements $h$ conjugate to $g$. Length up to conjugacy is obviously invariant
under conjugation, and we will also use the term to apply to conjugacy classes.

$N_G(r) = |\{g \in G \mid |g| = r\}|$,
$C_G(r) = |\{g \in N_G(r) \mid |g|_c = r\}|$,
$CC_G(r) = |\{C \in G/\text{conjugacy} \mid |C|_c = r\}|$.
The subscript $G$ will be omitted whenever the group $G$ is obvious from context.
Given a sequence $A = a_0, \dots, a_i, \dots$, we can define a generating function $\mathcal{F}[A]$, by

$\mathcal{F}[A](z) = \sum_{i=0}^{\infty} a_i z^i$.

There is frequently confusion as to whether the generating function is a holomorphic
function or an element of the ring of formal power series. In this section
"generating function" will mean a function analytic at $0 \in \mathbb{C}$.

The three counting functions above give rise to corresponding generating functions
$\mathcal{F}[N_G]$, $\mathcal{F}[C_G]$, $\mathcal{F}[CC_G]$. Our real interest will lie in the last of these; the first
one has been the most extensively studied, and the result most relevant to us is:

Fact 1. If $G$ is an automatic group, then the generating function $\mathcal{F}[N_G]$ is a rational
function.

For definitions and properties of automatic groups, see [9].

Fact 2. (Gromov, Epstein) If $G$ is an automatic group, then the generating function
$\mathcal{F}[C_G]$ is a rational function.

Facts 1 and 2 might lead us to expect that $\mathcal{F}[CC_G]$ is, likewise, rational, but in
fact the opposite seems to be the case, and we are led to:

Conjecture 13.1. Let $G$ be a word-hyperbolic group. Then $\mathcal{F}[CC_G]$ is rational if and only
if $G$ is virtually cyclic (elementary in the terminology of [12]).

In the sequel, this conjecture is supported by the complete analysis of the case
where $G$ is $F_k$ - the free group on $k$ generators.

14. Growth functions for free groups 
Let $F_k$ be the free group on $k$ generators. The following is obvious:

Fact 3. $N_{F_k}(r) = 2k(2k - 1)^{r-1}$.

Theorem 1.1 says that

$C_{F_k}(r) = (2k - 1)^r + 1 + (k - 1)[1 + (-1)^r]$.
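Both counts can be confirmed by direct enumeration for small $k$ and $r$. The following sketch is only an illustration: the encoding of the generator $a_i$ as the integer $+i$ and of its inverse as $-i$, and all function names, are assumptions made here for convenience.

```python
from itertools import product


def reduced_words(k, r):
    """Yield all freely reduced words of length r in F_k as tuples of nonzero ints."""
    letters = [s * i for i in range(1, k + 1) for s in (1, -1)]
    for w in product(letters, repeat=r):
        # reduced: no letter is followed by its own inverse
        if all(w[j] != -w[j + 1] for j in range(r - 1)):
            yield w


def count_reduced(k, r):
    """Number of reduced words of length r, by enumeration."""
    return sum(1 for _ in reduced_words(k, r))


def count_cyclically_reduced(k, r):
    """Number of reduced words whose first and last letters are not mutually inverse."""
    return sum(1 for w in reduced_words(k, r) if w[0] != -w[-1])


def reduced_formula(k, r):
    """Closed formula of Fact 3."""
    return 2 * k * (2 * k - 1) ** (r - 1)


def cyclically_reduced_formula(k, r):
    """Closed formula quoted from Theorem 1.1."""
    return (2 * k - 1) ** r + 1 + (k - 1) * (1 + (-1) ** r)


if __name__ == "__main__":
    for r in range(1, 6):
        print(r, count_reduced(2, r), count_cyclically_reduced(2, r))
```

The enumeration visits all $(2k)^r$ letter strings, so it is only practical for very small parameters, which is all that a sanity check needs.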

Corollary 14.1.

In order to compute $CC_{F_k}(r)$ it is enough to notice the following:
Theorem 14.2.

$r\, CC(r) = \sum_{d \mid r} \varphi(d) C(r/d)$,

where $\varphi$ denotes the Euler totient function.

Proof. The theorem is a trivial consequence of Burnside's lemma, stated below as
Theorem 14.3 for convenience, applied to the action of the cyclic group $\mathbb{Z}/(r\mathbb{Z})$ on
the set of cyclically reduced words of length $r$. □
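The identity of Theorem 14.2 can be compared with a direct count of rotation classes of cyclically reduced words. The sketch below is illustrative only; the integer encoding of letters ($\pm i$ for $a_i^{\pm 1}$) and the function names are assumptions of this example, and the brute-force count is feasible only for small $k$ and $r$.

```python
from itertools import product
from math import gcd


def cyc_reduced_count(k, r):
    """C(r) for F_k, from the closed formula quoted above."""
    return (2 * k - 1) ** r + 1 + (k - 1) * (1 + (-1) ** r)


def totient(d):
    """Euler totient function, computed naively."""
    return sum(1 for j in range(1, d + 1) if gcd(j, d) == 1)


def conjugacy_classes(k, r):
    """CC(r) from r * CC(r) = sum over d | r of phi(d) * C(r / d)."""
    total = sum(totient(d) * cyc_reduced_count(k, r // d)
                for d in range(1, r + 1) if r % d == 0)
    return total // r  # the division is exact by Burnside's lemma


def conjugacy_classes_brute(k, r):
    """Count rotation classes of cyclically reduced words of length r directly."""
    letters = [s * i for i in range(1, k + 1) for s in (1, -1)]
    seen = set()
    for w in product(letters, repeat=r):
        # cyclically reduced: no letter is followed (cyclically) by its inverse
        if all(w[j] != -w[(j + 1) % r] for j in range(r)):
            # the smallest rotation serves as a canonical representative
            seen.add(min(w[j:] + w[:j] for j in range(r)))
    return len(seen)


if __name__ == "__main__":
    for r in range(1, 7):
        print(r, conjugacy_classes(2, r), conjugacy_classes_brute(2, r))
```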

Theorem 14.3. Let $G$ be a finite group acting on a finite set $X$. For $g \in G$ let $\psi(g)$
denote the number of $x \in X$ such that $g(x) = x$. Then the number of orbits of $X$ under the
$G$-action is

$\frac{1}{|G|} \sum_{g \in G} \psi(g)$.
We now have the following general observation: 

Theorem 14.4. Suppose we have three sequences $A = \{a_i\}$, $B = \{b_j\}$, and $C = \{c_k\}$,
satisfying

$a_n = \sum_{d \mid n} c_d b_{n/d}$.

Then

$\mathcal{F}[A](z) = \sum_{d=1}^{\infty} c_d \mathcal{F}[B](z^d)$.

Proof. On the level of formal power series, the statement is clear by expanding
the left hand side. Otherwise, if the radius of convergence of $\mathcal{F}[A]$ is $r_a$, then the
radius of convergence of $G_d[A]$, defined as $G_d[A](z) = \mathcal{F}[A](z^d)$ is, by Hadamard's
criterion, equal to $r_a^{1/d}$, so all of $G_d[A]$ converge on the disk of radius $R_a = \min(r_a, 1)$
around the origin. Since the series on the right hand side converges at 0 (since all
the terms vanish), it converges uniformly on compact subsets of the disk of radius
$R_a$ around the origin. □

Corollary 14.5. Let $\mathcal{H}$ be the generating function of the sequence $h_r = r\, CC(r)$. Then

$\mathcal{H}(z) = 1 + \sum_{d=1}^{\infty} \varphi(d) \mathcal{F}[C](z^d)$.

We can combine all of the above results into the following conclusion: 

Theorem 14.6. The generating function $\mathcal{H}$ as in the statement of Corollary 14.5 can be
expanded as:

In particular, $\mathcal{H}$ has an infinite number of poles, and is not a rational function for any
$k > 1$. The generating function $\mathcal{F}[CC_{F_k}]$ can be written as

$\mathcal{F}[CC_{F_k}](z) = \int_0^z \frac{\mathcal{H}(t)}{t}\, dt$,

and so is not a rational function either.

Proof. The expression for $\mathcal{H}$ is fairly obvious, with the comment that the second
summand is a consequence of the fact that

$\sum_{d \mid n} \varphi(d) = n$.

That $\mathcal{H}$ has an infinite number of poles follows from the observation that the $d$-th
term in the third summand has its $d$ poles on the circle $|z| = (2k - 1)^{-1/d}$, while the
first two summands are analytic in the open unit disk. The expression for $\mathcal{F}[CC_{F_k}]$
is immediate. □

Remark 14.7. For $k = 1$, it is not hard to see that

$\mathcal{H}(x) = 1 + \frac{2x}{(x - 1)^2}$.



Remark. Various people, when shown Theorem 14.6, appeared to believe that it
contradicts [12, Theorem 5.2D]. In fact (as pointed out by Greg McShane), Gromov's
function $N$ is not (as the common misunderstanding has it) the same as
$CC_G(r)$ in the case of a free group, but is the same as $C_G(r)$.

14.1. Some further comments. The following observation is quite obvious: 

Observation 14.8. Let $G_1$ and $G_2$ be two groups. Then,

$\mathcal{F}[CC_{G_1 \times G_2}](z) = \mathcal{F}[CC_{G_1}](z)\, \mathcal{F}[CC_{G_2}](z)$.

It would be interesting to find other relationships (for example, what happens
for HNN extensions?).

Observation 14.8 has some consequences:

Theorem 14.9. Let $G_1$ and $G_2$ be two groups. If $\mathcal{F}[CC_{G_1}]$ is rational, while $\mathcal{F}[CC_{G_2}]$
is not, then $\mathcal{F}[CC_{G_1 \times G_2}]$ is not rational. If both $\mathcal{F}[CC_{G_1}]$ and $\mathcal{F}[CC_{G_2}]$ are rational, then
so is $\mathcal{F}[CC_{G_1 \times G_2}]$.

Corollary 14.10. If $G_1 = \mathbb{Z}^n$ and $G_2$ is a finite group, then $\mathcal{F}[CC_{G_1 \times G_2}]$ is rational.

Remark. It is not clear whether $\mathcal{F}[CC_G]$ is rational when $G$ is a Bieberbach group
- most likely this depends on the choice of the generating set, as conjectured by
D. B. A. Epstein.

Corollary 14.11. If $G_1 = F_k$ and $G_2$ is a direct product of finite groups and infinite cyclic
groups, then $\mathcal{F}[CC_{G_1 \times G_2}]$ is irrational.

Theorem 14.12. If $G = F_{k_1} \times F_{k_2} \times \dots \times F_{k_n}$, then $\mathcal{F}[CC_G]$ is irrational (with respect to
the "obvious" generating set).

Proof. This is an immediate consequence of Theorem 14.6. □

15. Primitive conjugacy class zeta function 

One can compute a zeta-function analogous to that of Ihara for the numbers
of primitive conjugacy classes of a given length (a primitive class is one which
is not the power of a smaller class), using, essentially, the elementary method
described by Stark and Terras, [59], as applied to the graph constructed in Section
1. This function turns out to be rational (in fact, there is a simple formula for it,
see Theorem 15.1). More precisely, consider

(15.1) $\zeta(G)^{-1} = \prod_{[c]} (1 - u^{l(c)})$,

where $[c]$ denotes the equivalence classes of primitive cycles, $l(c)$ is the length of $c$,
and two cycles are considered equivalent if one can be obtained from the other by a rotation.
A computation then shows that

(15.2) $\zeta(F_r)^{-1} = (1 - u^2)^{r-1}(1 - u)(1 - (2r - 1)u)$.

The computation goes as follows:
First, note that

$\log(1 - x) = -\sum_{i=1}^{\infty} \frac{x^i}{i}$,

and thus

$\log \zeta(G) = \sum_{[c]} \sum_{i=1}^{\infty} \frac{1}{i} u^{i\, l(c)}$.

The above can be rewritten (note that the sum is now over primitive cycles, and
not equivalence classes thereof):

$\frac{d \log \zeta(G)}{du} = \frac{1}{u} \sum_{c} \sum_{i=1}^{\infty} u^{i\, l(c)}$.

But note that the sum on the right hand side is simply the ordinary generating function for all
cycles:

(15.3) $\frac{d \log \zeta(G)}{du} = \frac{1}{u} \sum_{i=1}^{\infty} N_i u^i$,

where $N_i$ is the number of cycles of length $i$ in $G$, and this generating function was
computed in Section 1:

$\frac{1}{u} \sum_{i=1}^{\infty} N_i u^i = \frac{2r - 1}{1 - (2r - 1)u} + \frac{r}{1 - u} - \frac{r - 1}{1 + u}$.

The formula (15.2) now follows by a straightforward integration.

A quick examination of the above argument shows that the formula (15.2) is a
special case of the following result:

Theorem 15.1. Let $G$ be a finite graph, and let $\zeta_G$ be the zeta function defined by formula
(15.1). Let $A(G)$ be the adjacency matrix of $G$. Then

(15.4) $\zeta_G(u)^{-1} = \det (I - uA(G))$.

In other words, the zeta function is essentially the reciprocal of the characteristic polynomial of $A(G)$.



Proof. The argument above up to Equation (15.3) is completely general. On the
other hand, the sum on the right hand side of Equation (15.3) can be rewritten as:

$\sum_{i=1}^{\infty} N_i u^i = \sum_{i=1}^{\infty} \operatorname{tr} A(G)^i u^i = \operatorname{tr} [-I + (I - uA(G))^{-1}] = \operatorname{tr} [uA(G)(I - uA(G))^{-1}]$.

Thus,

$\frac{d \log \zeta(G)}{du} = \operatorname{tr} (A(G)(I - uA(G))^{-1})$,

and so it follows that

$\zeta(G)^{-1} = C \det (I - uA(G))$,

where $C$ is a constant of integration, seen to be equal to 1 by computing both sides
at $u = 0$. □
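The identity between the product over primitive cycle classes and $\det(I - uA(G))$ can be checked as an identity of truncated power series. The sketch below is an illustration under assumptions of its own: cycles are taken to be closed walks with a base point (so that $\operatorname{tr} A^m$ counts those of length $m$), the number of primitive classes of each length is obtained by Möbius inversion, the determinant is expanded by the Leibniz formula (practical only for small matrices), and all function names are hypothetical.

```python
from itertools import permutations


def poly_mul(p, q, n):
    """Multiply integer polynomials (coefficient lists), truncated after degree n."""
    out = [0] * (n + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= n:
                out[i + j] += a * b
    return out


def det_one_minus_uA(a, n):
    """det(I - u A) as a coefficient list up to degree n, by the Leibniz expansion."""
    size = len(a)
    total = [0] * (n + 1)
    for perm in permutations(range(size)):
        inv = sum(1 for i in range(size) for j in range(i + 1, size) if perm[i] > perm[j])
        term = [(-1) ** inv]
        for i, j in enumerate(perm):
            # entry (i, j) of I - uA, as a polynomial in u
            term = poly_mul(term, [1 if i == j else 0, -a[i][j]], n)
        total = [x + y for x, y in zip(total, term)]
    return total


def mobius(m):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result


def trace_powers(a, n):
    """[tr A^1, ..., tr A^n]."""
    size = len(a)
    power = [[int(i == j) for j in range(size)] for i in range(size)]
    out = []
    for _ in range(n):
        power = [[sum(power[i][k] * a[k][j] for k in range(size))
                  for j in range(size)] for i in range(size)]
        out.append(sum(power[i][i] for i in range(size)))
    return out


def euler_product(a, n):
    """Product over primitive cycle classes of (1 - u^length), truncated after degree n."""
    tr = trace_powers(a, n)
    prod = [1] + [0] * n
    for m in range(1, n + 1):
        primitive = sum(mobius(d) * tr[m // d - 1] for d in range(1, m + 1) if m % d == 0)
        classes = primitive // m  # each class consists of m rotations
        factor = [1] + [0] * (m - 1) + [-1]
        for _ in range(classes):
            prod = poly_mul(prod, factor, n)
    return prod


if __name__ == "__main__":
    golden = [[1, 1], [1, 0]]
    print(euler_product(golden, 8), det_one_minus_uA(golden, 8))
```

For the graph attached to $F_2$ (four letters, each allowed to be followed by any letter except its inverse) the determinant expands to $(1 - u^2)(1 - u)(1 - 3u)$, in agreement with the right-hand side of formula (15.2) for $r = 2$.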